Abstract

We study the average leaf-to-leaf path lengths on ordered Catalan tree graphs 
with n nodes and show that these are equivalent to the average length of paths 
starting from the root node. We give an explicit analytic formula for the average leaf- 
to-leaf path length as a function of separation of the leaves and study its asymptotic 
properties. At the heart of our method is a strategy based on an abstract graph 
representation of generating functions which we hope can be useful also in other 
contexts. 


1 Introduction 


The aim of this document is to find an analytic expression for the average path lengths
between leaves in full binary trees. Similar tree structures have been used as an
analogue for connections in the Hilbert space of wavefunctions in strongly disordered
Heisenberg chains [4], where the path lengths are believed to be related to correlation
functions [2]. More generally, it has been argued recently that certain RG approaches to
the Hilbert space of critical many-body interacting systems in d dimensions share many
of their geometric properties with (d + 1)-dimensional AdS [9,15]. This connection is a
manifestation of the so-called AdS/CFT correspondence [7] and its possible applications in
condensed matter physics [8]. In these quantum systems, the leaves are ordered according
to their physical position, for example the location of magnetic ions in a quantum wire.
This ordering imposes a new restriction on the tree itself and the lengths which become
important are leaf-to-leaf path lengths across the ordered tree. We emphasise that the
path lengths therefore correspond to quite different measures than those studied in the
traditional graph theory and computer science approaches usually concerned with tree
graphs [13].


The construction of this tree of connections in Hilbert space is based on details of
so-called tensor network methods [10,11], most famous of which is the variational
matrix-product state (vMPS) method that underlies modern density matrix renormalisation
group (DMRG) algorithms [12,14,18]. These provide elegant and powerful tools for the
simulation of quantum many-body systems in low dimensions. In such loop-free tensor
networks, correlations scale as $e^{-\alpha \ell(x_1,x_2)}$, where $\ell(x_1,x_2)$ is the
number of tensors that connect leaf $x_1$ to leaf $x_2$ [2]. Typical vMPS methods for
gapped systems give $\ell \sim |x_2 - x_1|$ and correlations scale exponentially. On the
other hand, a path length $\ell \approx \log|x_2 - x_1|$, as found in the multi-scale
entanglement renormalisation ansatz (MERA) [1,16,17], leads to a power law decay
$\sim e^{-\alpha \log|x_2 - x_1|} = |x_2 - x_1|^{-\alpha}$.

Different quantum systems give rise to different tree structures, and these in turn lead 
to different distance behaviours for the path lengths. Hence it is interesting to ask more 
generally about the path lengths of different tree graphs. In Ref. [3], we studied the
case of full and complete m-ary tree graphs using a recursive approach. We found explicit
expressions in terms of Hurwitz-Lerch transcendents. For incomplete random binary trees,
we presented first numerical results. In the present manuscript, we extend our results
to full binary trees (Catalan graphs), i.e. we drop the requirement of completeness. An
example of path lengths in such Catalan trees is given in Fig. 1. The absence of paths that
would be present in a complete binary tree seems similar to the more random structure
seen in random binary trees [3]. We might hence expect that Catalan trees should in
many ways be similar to the tree structure found in Ref. [4].

We consider a tree graph where every internal vertex has exactly two children as shown
in Fig. 2. Such a tree is usually called a full binary tree. Let us define n as the number of
internal vertices in the graph. We shall refer to such internal vertices as nodes or simply
vertices. The root is the top vertex of the tree. It is unique as it is not the child of any
vertex. We shall call a terminal vertex with no children a leaf. A sub-tree rooted at vertex
v is the set of vertices and leaves that descend from v, including v itself. A primary
sub-tree is a sub-tree rooted at one of the children of the root node.

For fixed n, there is a finite number of unique trees as shown in Fig. 2. This set of
trees can be decomposed in terms of primary sub-trees with p (left) and q (right) vertices,
where p + q + 1 = n, as shown by the dashed boxes. This allows us to use a convenient
diagrammatic decomposition as given by Fig. 3. The number of trees in each subset is the
number in the left sub-tree, $C_\alpha$, times the number in the right sub-tree,
$C_{n-\alpha-1}$. When summed over all n of the subsets we obtain

$$
C_n = \sum_{\alpha=0}^{n-1} C_\alpha C_{n-\alpha-1}. \tag{1}
$$

Figure 1: A Catalan tree graph with various definitions as defined in the main text.
Circles (•, o) denote vertices while lines indicate edges between the vertices of different
depth. The tree as shown has n = 15 and 16 leaves (o). The indicated separation is r = 6
while the associated leaf-to-leaf path length equals $\ell = 6$, as indicated by the thick
green line.

This expression is known as Segner's relation [6]. It defines the permissible values of
the $C_n$, which are the celebrated Catalan numbers. We have therefore shown that the
number of full binary trees with n vertices is given by the Catalan number $C_n$. Similarly,
it is straightforward to show that, with L denoting the number of leaves, L = n + 1.
For the examples given in Fig. 2 we see that for n = 1, 2 and 3, we have $C_1 = 1$,
$C_2 = 2$ and $C_3 = 5$ unique trees with 2, 3 and 4 leaves in each such tree.
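As a quick numerical illustration of Segner's relation (1), the following sketch builds the
Catalan numbers recursively and compares them with the closed form binom(2n, n)/(n + 1).
The function names `catalan_segner` and `catalan_closed` are illustrative choices and not
part of the original text; the starting value C_0 = 1 is assumed.

```python
from math import comb


def catalan_segner(n_max):
    """Return [C_0, ..., C_{n_max}] built from Segner's relation (1)."""
    c = [1]  # assumed starting value C_0 = 1
    for n in range(1, n_max + 1):
        # C_n = sum_{a=0}^{n-1} C_a * C_{n-a-1}
        c.append(sum(c[a] * c[n - a - 1] for a in range(n)))
    return c


def catalan_closed(n):
    """Closed form C_n = binom(2n, n) / (n + 1)."""
    return comb(2 * n, n) // (n + 1)


print(catalan_segner(6))
print([catalan_closed(n) for n in range(7)])
```

Both lines list the same sequence; the values for n = 1, 2, 3 are the tree counts quoted
above for Fig. 2.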


2 Catalan numbers, generating functions and ordered 
Catalan trees 


2.1 Basic properties of Catalan numbers 

The theory of Catalan numbers and their applications is well developed [6], so we only
briefly cite some of the results we shall need in the following. Catalan numbers can
be computed explicitly using well-known relations such as

$$
C_n = \frac{1}{n+1}\binom{2n}{n} = \frac{(2n)!}{(n+1)!\,n!} = \frac{4n-2}{n+1}\,C_{n-1}. \tag{2}
$$


In the following, we shall make extensive use of generating functions [6,19]. Generating
functions offer a convenient way to manipulate relations between the set of Catalan
numbers. More generally, they serve as a very useful formal device for manipulating a series
and for determining the properties of unknown series [19]. Given a general (infinite) series
$a_0, a_1, a_2, a_3, \ldots$, it is always possible to formally define a function

$$
a(x) = a_0 + a_1 x + a_2 x^2 + a_3 x^3 + \cdots = \sum_{n=0}^{\infty} a_n x^n, \tag{3}
$$

which we call the generating function of the sequence $a_n$. We can also invert the above
definition and instead say that $a_n$ is the coefficient of $x^n$ in the Taylor series
representation of a(x) about zero. Following Ref. [19], we will employ the notation
$a_n = [x^n]\{a(x)\}$.

Figure 2: The set of all full binary trees with n = 1, 2 and 3 vertices. The dashed
boxes denote the sub-trees that make up the full set and the number in the box
denotes n of the sub-tree. Note that in this example all dashed boxes correspond
to primary sub-trees as defined in the text.


For the Catalan numbers, we can define the generating function as

$$
C(x) = \sum_{n=0}^{\infty} C_n x^n. \tag{4}
$$

For later use, we will write the inverse relation as [19]

$$
C_n = [x^n]\{C(x)\}. \tag{5}
$$

Squaring (4) gives

$$
\begin{aligned}
C^2(x) &= C_0^2 + (C_0 C_1 + C_1 C_0)x + \cdots + (C_0 C_n + C_1 C_{n-1} + \cdots + C_n C_0)x^n + \ldots \\
&= C_1 + C_2 x + C_3 x^2 + \cdots + C_{n+1} x^n + \ldots \\
&= \frac{C(x) - C_0}{x},
\end{aligned} \tag{6}
$$

where we have used Eq. (1) in each term. Multiplication by x results in

$$
C(x) = xC^2(x) + 1, \tag{7}
$$

which will turn out to be a useful form for the generating function of the Catalan numbers.

Figure 3: Diagrammatic decomposition of the set of trees with n vertices in terms
of primary sub-trees whose vertices add up to n - 1.
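The functional relation C(x) = xC^2(x) + 1 of Eq. (7) can be checked order by order on
truncated power series. The sketch below represents a series by its list of coefficients;
the helper names `multiply_truncated` and `check_catalan_functional_equation` are
hypothetical and chosen only for this illustration.

```python
from math import comb


def catalan(n):
    """Catalan number C_n = binom(2n, n) / (n + 1)."""
    return comb(2 * n, n) // (n + 1)


def multiply_truncated(a, b, order):
    """Product of two power series (coefficient lists), kept up to x**(order - 1)."""
    out = [0] * order
    for i, a_i in enumerate(a[:order]):
        for j, b_j in enumerate(b[:order - i]):
            out[i + j] += a_i * b_j
    return out


def check_catalan_functional_equation(order):
    """Compare the coefficients of C(x) and x*C(x)**2 + 1 up to x**(order - 1)."""
    c = [catalan(n) for n in range(order)]
    c_squared = multiply_truncated(c, c, order)
    # Multiplying by x shifts every coefficient up by one power; the +1 fills x**0.
    rhs = [1] + c_squared[:order - 1]
    return rhs == c


print(check_catalan_functional_equation(12))
```

The same truncated product is all that is needed to test the other generating function
identities in this paper numerically.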

3 Rooted paths in ordered Catalan trees via explicit
construction

The first step in our quest to compute the path lengths between different leaves is to
consider the length of the path from a leaf to the root node. Because such paths connect
to the root node, we shall call them rooted paths. We start by defining m as the horizontal
position of a leaf in a graph, starting with m = 1 for the left-most leaf. The lines in the
graph cannot cross (cf. Fig. 2) and, working from the root, all children of the left sub-tree
are always to the left of children of the right sub-tree. This is true at all levels and for
all sub-trees. We call the depth of a leaf the number of internal vertices that connect the
leaf to the root node (including the root node). Next, we define the depth function
$D_{m,n}$ as the total number of vertices that connect the leaf m to the root when summing
over all possible graphs with n vertices.

3.1 The depth function of the first and second leaf 

The set of trees with n vertices can be decomposed in terms of trees with fewer vertices
as shown in Fig. 3. Hence the sum of the depths of the first leaf can be expressed in terms
of the sub-trees. Then $D_{1,n}$ is equal to the sum of the first leaf depth of the left hand
sub-tree multiplied by the degeneracy of the right sub-tree, plus the number of vertices
that connect the sub-trees,

$$
D_{1,n} = \sum_{k=0}^{n-1} \left[ D_{1,k} C_{n-1-k} + C_{n-1-k} C_k \right]
        = \sum_{k=0}^{n-1} D_{1,k} C_{n-1-k} + C_n. \tag{8}
$$

Here we have used Segner's relation (1) to simplify the final result. In the following, we
shall denote such equations that implicitly define terms on the left-hand side by sums over
combinations of Catalan numbers on the right-hand side as master equations. In order to
solve this first master equation, we define a generating function as follows,

$$
\mathcal{D}_1(x) = \sum_{n=0}^{\infty} D_{1,n} x^n, \tag{9}
$$

with $D_{1,0} = 0$. Multiplying by C(x) gives us an expression of the form

$$
\begin{aligned}
\mathcal{D}_1(x) C(x) &= D_{1,0} C_0 + (D_{1,0} C_1 + D_{1,1} C_0)x + \cdots + (D_{1,0} C_n + D_{1,1} C_{n-1} + \cdots + D_{1,n} C_0)x^n + \ldots \\
&= (D_{1,1} - C_1) + (D_{1,2} - C_2)x + \cdots + (D_{1,n+1} - C_{n+1})x^n + \ldots \\
&= \sum_{n=0}^{\infty} D_{1,n+1} x^n - \sum_{n=0}^{\infty} C_{n+1} x^n \\
&= \frac{(\mathcal{D}_1(x) - D_{1,0}) - (C(x) - C_0)}{x}.
\end{aligned} \tag{10}
$$

Collecting the $\mathcal{D}_1(x)$ terms and using Eq. (7) gives

$$
\mathcal{D}_1(x) = C^2(x) + (D_{1,0} - C_0) C(x) = C^2(x) - C(x). \tag{11}
$$

This can be put back into the form of a generating function

$$
\mathcal{D}_1(x) = \sum_{n=0}^{\infty} \left[ C_{n+1} - C_n \right] x^n. \tag{12}
$$

Therefore we have

$$
D_{1,n} = C_{n+1} - C_n. \tag{13}
$$
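The result for the first leaf, D_{1,n} = C_{n+1} - C_n, can be confirmed by brute force for
small n. The sketch below enumerates all full binary trees with n internal vertices and
sums the depth of a chosen leaf. The tree encoding (a leaf is `None`, a vertex is a pair
`(left, right)`) and all function names are assumptions made for this illustration only.

```python
from functools import lru_cache
from math import comb


def catalan(n):
    """Catalan number C_n = binom(2n, n) / (n + 1)."""
    return comb(2 * n, n) // (n + 1)


@lru_cache(maxsize=None)
def full_binary_trees(n):
    """All full binary trees with n internal vertices."""
    if n == 0:
        return (None,)
    result = []
    for p in range(n):  # p vertices in the left primary sub-tree
        for left in full_binary_trees(p):
            for right in full_binary_trees(n - 1 - p):
                result.append((left, right))
    return tuple(result)


def leaf_depths(tree, depth=0):
    """Depths of the leaves from left to right (internal vertices above each leaf)."""
    if tree is None:
        return [depth]
    left, right = tree
    return leaf_depths(left, depth + 1) + leaf_depths(right, depth + 1)


def summed_depth(m, n):
    """Brute-force D_{m,n}: depth of the m-th leaf summed over all trees with n vertices."""
    return sum(leaf_depths(tree)[m - 1] for tree in full_binary_trees(n))


for n in range(1, 7):
    print(n, summed_depth(1, n), catalan(n + 1) - catalan(n))
```

The enumeration grows like the Catalan numbers themselves, so this is only practical as
a check for small n.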

The second leaf on the (0, n - 1) subset is no longer solely in the left hand sub-tree, as
shown in Fig. 4 where the n vertex tree is decomposed in terms of the primary sub-trees.
Therefore the master equation for the second leaf depth is

$$
D_{2,n} = \sum_{k=1}^{n-1} D_{2,k} C_{n-1-k} + D_{1,n-1} + C_n. \tag{14}
$$

As before, we create a generating function for the depth,

$$
\mathcal{D}_2(x) = \sum_{n=1}^{\infty} D_{2,n} x^n. \tag{15}
$$

The sum starts at 1 to reflect the fact that the second leaf depth is of course not defined
for a tree with a single leaf. As with the first leaf, we multiply by C(x) and compare with
the master equation

$$
\begin{aligned}
\mathcal{D}_2(x) C(x) &= D_{2,1} C_0 x + (D_{2,1} C_1 + D_{2,2} C_0)x^2 + \cdots + (D_{2,1} C_{n-1} + \cdots + D_{2,n} C_0)x^n + \ldots \\
&= (D_{2,2} - D_{1,1} - C_2)x + \cdots + (D_{2,n+1} - D_{1,n} - C_{n+1})x^n + \ldots \\
&= \sum_{n=1}^{\infty} D_{2,n+1} x^n - \sum_{n=1}^{\infty} D_{1,n} x^n - \sum_{n=1}^{\infty} C_{n+1} x^n \\
&= \frac{\mathcal{D}_2(x) - D_{2,1} x}{x} - \left[ \mathcal{D}_1(x) - D_{1,0} \right] - \frac{C(x) - C_1 x - C_0}{x}.
\end{aligned} \tag{16}
$$

Figure 4: Schematic representation of the decomposition of the set of trees with
n vertices into trees with fewer vertices. The second leaf is highlighted showing
that in the (0, n - 1) subset, it is located in the right hand sub-tree.

From Eq. (7) we can write $\mathcal{D}_2(x) = xC(x)\mathcal{D}_1(x) + C^2(x) - C(x)$ and also
$x\mathcal{D}_1(x) = C(x) - xC(x) - 1$. Using this in Eq. (16) gives

$$
\mathcal{D}_2(x) = 2C^2(x) - 2C(x) - xC^2(x). \tag{17}
$$

Writing it as a summation,

$$
\mathcal{D}_2(x) = \sum_{n=0}^{\infty} \left[ 2C_{n+1} - 2C_n \right] x^n - \sum_{n=1}^{\infty} C_n x^n
                 = \sum_{n=1}^{\infty} \left[ 2C_{n+1} - 3C_n \right] x^n, \tag{18}
$$

results in

$$
D_{2,n} = 2C_{n+1} - 3C_n. \tag{19}
$$
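The master equation (14) can also be iterated directly, which gives an independent check
of the closed form D_{2,n} = 2C_{n+1} - 3C_n without any generating functions. The sketch
below assumes the first-leaf result D_{1,n} = C_{n+1} - C_n as input; the function name
`second_leaf_depths` is illustrative.

```python
from math import comb


def catalan(n):
    """Catalan number C_n = binom(2n, n) / (n + 1)."""
    return comb(2 * n, n) // (n + 1)


def second_leaf_depths(n_max):
    """Iterate the master equation (14); entry n of the result is D_{2,n}."""
    d1 = [catalan(n + 1) - catalan(n) for n in range(n_max + 1)]  # first-leaf depths
    d2 = [0] * (n_max + 1)  # d2[0] is a placeholder: a single leaf has no second leaf
    for n in range(1, n_max + 1):
        convolution = sum(d2[k] * catalan(n - 1 - k) for k in range(1, n))
        d2[n] = convolution + d1[n - 1] + catalan(n)
    return d2


d2 = second_leaf_depths(8)
for n in range(1, 9):
    print(n, d2[n], 2 * catalan(n + 1) - 3 * catalan(n))
```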


3.2 The depth function of the mth leaf

We begin by making a master equation in a similar manner to the previous examples.
In (14) we noticed that there was one diagram in the decomposition where the 2nd leaf
misses the left hand primary sub-tree. For general m there will be m - 1 such cases. When
the left sub-tree has k = 0 to m - 2 vertices, the contribution to the depth is from the leaf
m - k - 1 of the right sub-tree because the left sub-tree has k + 1 leaves. The degeneracy
of this term is given by the number of trees in the left block, $C_k$. When k > m - 2 the
contribution is from the mth leaf of the left block as before. Hence, in full the master
equation can be written as

$$
D_{m,n} = C_n + \sum_{k=m-1}^{n-1} D_{m,k} C_{n-1-k} + \sum_{k=0}^{m-2} D_{m-k-1,n-k-1} C_k. \tag{20}
$$

Before proceeding any further, we note that the depth function is left-right symmetric
in that the depth of the mth leaf from the left is the same as the depth of the mth leaf
from the right, $D_{m,n} = D_{n+2-m,n}$. This can be seen by noting that, in the same way as
Eq. (20), a master equation can be found for the pth leaf from the right,

$$
D_{n+2-p,n} = C_n + \sum_{k=0}^{n-p} D_{n-k-p+1,n-k-1} C_k + \sum_{k=n-p+1}^{n-1} D_{n+2-p,k} C_{n-k-1}. \tag{21}
$$

The first sum is where the (n + 2 - p)th leaf is leaf (n - k - p + 1) of the right sub-tree.
The second sum is where the left sub-tree is large enough to have n + 2 - p leaves. The
degeneracies are given by the number of trees in the opposite block as before. Setting
m = n + 2 - p recovers Eq. (20) as desired.
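The left-right symmetry can be tested numerically by evaluating the master equation (20)
recursively. The sketch below is a minimal memoised implementation, assuming the reading
of Eq. (20) as D_{m,n} = C_n + sum_{k=m-1}^{n-1} D_{m,k} C_{n-1-k} + sum_{k=0}^{m-2}
D_{m-k-1,n-k-1} C_k with D_{1,0} = 0; the function name `depth_sum` is hypothetical.

```python
from functools import lru_cache
from math import comb


def catalan(n):
    """Catalan number C_n = binom(2n, n) / (n + 1)."""
    return comb(2 * n, n) // (n + 1)


@lru_cache(maxsize=None)
def depth_sum(m, n):
    """D_{m,n} from the master equation (20), for 1 <= m <= n + 1."""
    if not 1 <= m <= n + 1:
        raise ValueError("leaf index m must satisfy 1 <= m <= n + 1")
    if n == 0:
        return 0  # a single leaf has no vertices above it
    total = catalan(n)  # the root contributes once per tree
    # The m-th leaf lies in the left primary sub-tree (k >= m - 1 vertices on the left).
    total += sum(depth_sum(m, k) * catalan(n - 1 - k) for k in range(m - 1, n))
    # The m-th leaf lies in the right primary sub-tree (the left one has k + 1 < m leaves).
    total += sum(depth_sum(m - k - 1, n - k - 1) * catalan(k) for k in range(m - 1))
    return total


for n in range(1, 5):
    print(n, [depth_sum(m, n) for m in range(1, n + 2)])
```

Each printed row reads the same from left to right as from right to left, which is the
symmetry stated above.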

Just as we saw for m = 2, the generating function for general m needs to start at
n = m - 1, i.e.

$$
\mathcal{D}_m(x) = \sum_{n=m-1}^{\infty} D_{m,n} x^n. \tag{22}
$$

Starting from the master equation (20) for $D_{m,n+m}$, we have

$$
D_{m,n+m} = C_{n+m} + \sum_{k=m-1}^{n+m-1} D_{m,k} C_{n+m-k-1} + \sum_{k=0}^{m-2} D_{m-k-1,n+m-k-1} C_k. \tag{23}
$$

After some algebra, we find

$$
\begin{aligned}
\mathcal{D}_m(x) = {} & C^2(x) - C(x) \sum_{n=0}^{m-1} C_n x^n + C(x) D_{1,m-1} x^{m-1} \\
& + C(x) \sum_{k=0}^{m-2} C_k x^{k+1} \left[ \mathcal{D}_{m-k-1}(x) - D_{1,m-k-2} x^{m-k-2} \right].
\end{aligned} \tag{24}
$$

This means that we only need to know the depth of the first leaf to create the generating
functions for the rest, so

$$
\begin{aligned}
\mathcal{D}_m(x) = {} & C^2(x) - C(x) \sum_{n=0}^{m-1} C_n x^n + C(x)(C_m - C_{m-1}) x^{m-1} \\
& + C(x) \sum_{k=0}^{m-2} C_k x^{k+1} \left( \mathcal{D}_{m-k-1}(x) - (C_{m-k-1} - C_{m-k-2}) x^{m-k-2} \right).
\end{aligned} \tag{25}
$$

Using Segner's relation (1) we notice that for $m \geq 2$,
$\sum_{k=0}^{m-2} C_k C_{m-k-2} = C_{m-1}$ and
$\sum_{k=0}^{m-2} C_k C_{m-k-1} = \sum_{k=0}^{m-1} C_k C_{m-k-1} - C_{m-1} C_0 = C_m - C_{m-1}$.
This enables us to simplify $\mathcal{D}_m(x)$ to

$$
\mathcal{D}_m(x) = C^2(x) - C(x) \sum_{n=0}^{m-1} C_n x^n + C(x) C_{m-1} x^{m-1}
                 + C(x) \sum_{k=0}^{m-2} C_k x^{k+1} \mathcal{D}_{m-k-1}(x). \tag{26}
$$

As $m \geq 2$ this simplifies further,

$$
\begin{aligned}
\mathcal{D}_m(x) = {} & C^2(x) - C(x) \sum_{n=0}^{m-2} C_n x^n - C(x) C_{m-1} x^{m-1} \\
& + C(x) C_{m-1} x^{m-1} + C(x) \sum_{k=0}^{m-2} C_k x^{k+1} \mathcal{D}_{m-k-1}(x), \\
\mathcal{D}_m(x) = {} & C^2(x) + C(x) \sum_{n=0}^{m-2} C_n x^n \left[ x \mathcal{D}_{m-n-1}(x) - 1 \right].
\end{aligned} \tag{27}
$$

We now introduce the ansatz that will take us to the solution. We will assume that
the generating function $\mathcal{D}_m(x)$ decomposes as

$$
\mathcal{D}_m(x) = f_m(x) C(x) + g_m(x), \tag{28}
$$

where $f_m(x)$ and $g_m(x)$ are polynomials in x of order m - 2. The ansatz is motivated by
the observation that for $m \geq 2$, $D_{m,n}$ is composed of a sum of Catalan numbers with
indices from n + 1 down to n - (m - 2). The form of $f_m(x) C(x)$ gives the expression for
$D_{m,n}$. When putting it in the form of (22) there is also a slew of terms that are cancelled
by $g_m(x)$. After some more algebra (see Appendix C.1), we find

$$
D_{m,n} = m C_{n+1} - C_n - 2 \sum_{k=0}^{m-2} (m - k - 1) C_k C_{n-k}. \tag{29}
$$


Using (A.14), we can show that this expression in explicit form gives

$$
\begin{aligned}
D_{m,n} &= \frac{m(m+1)(2n-2m+3)(n-m+2)}{(n+1)(n+2)}\, C_m C_{n-m+1} - C_n \\
        &= \frac{2m(m+1)(2n-2m+1)(2n-2m+3)}{(n+1)(n+2)}\, C_m C_{n-m} - C_n.
\end{aligned} \tag{30}
$$
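The sum in Eq. (29) and its explicit form can be compared numerically. The sketch below
assumes that the explicit form reads D_{m,n} = 2m(m+1)(2n-2m+1)(2n-2m+3) C_m C_{n-m} /
[(n+1)(n+2)] - C_n for 1 <= m <= n; this reading, and the function names
`depth_sum_eq29` and `depth_sum_explicit`, are assumptions of this illustration rather
than statements taken verbatim from the text.

```python
from fractions import Fraction
from math import comb


def catalan(n):
    """Catalan number C_n = binom(2n, n) / (n + 1)."""
    return comb(2 * n, n) // (n + 1)


def depth_sum_eq29(m, n):
    """D_{m,n} from Eq. (29), for 1 <= m <= n + 1."""
    tail = sum((m - k - 1) * catalan(k) * catalan(n - k) for k in range(m - 1))
    return m * catalan(n + 1) - catalan(n) - 2 * tail


def depth_sum_explicit(m, n):
    """Assumed explicit form of D_{m,n}, for 1 <= m <= n (exact rational arithmetic)."""
    prefactor = Fraction(
        2 * m * (m + 1) * (2 * n - 2 * m + 1) * (2 * n - 2 * m + 3),
        (n + 1) * (n + 2),
    )
    return prefactor * catalan(m) * catalan(n - m) - catalan(n)


for n in range(1, 6):
    print(n, [depth_sum_eq29(m, n) for m in range(1, n + 1)],
          [int(depth_sum_explicit(m, n)) for m in range(1, n + 1)])
```

The last leaf, m = n + 1, is not covered by the explicit form as written here; by the
left-right symmetry it has the same summed depth as the first leaf.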


4 Leaf-to-leaf path lengths in an ordered Catalan 
tree 

We denote the total number of vertices that connect two leaves of separation r, when
summing over all possible trees with n vertices, as the summed leaf-to-leaf path length
$S_n(r)$. Similarly, $A_n(r)$ indicates the average leaf-to-leaf path length. For brevity, we
shall henceforth refer to leaf-to-leaf path lengths simply as path lengths.

Figure 5: Schematic of splitting an n vertex tree into a single vertex, an $n - \alpha - 1$
vertex tree and an $\alpha$ vertex tree. A possible path from leaf $n - \alpha$ on the
$n - \alpha - 1$ vertex tree to the 1st leaf on the $\alpha$ tree is indicated as a dashed line.


4.1 Leaf-to-leaf path length for separation r = 1

Consider splitting an n vertex tree into a single vertex, an $n - \alpha - 1$ vertex tree and an
$\alpha$ vertex tree as shown in Fig. 5. For the $\alpha$ vertex tree, the summed path length is
simply $S_\alpha(1)$; for the $n - \alpha - 1$ vertex tree, the summed path length is
$S_{n-\alpha-1}(1)$. The path connecting the two trees, i.e. the $(n - \alpha)$th leaf of the
$n - \alpha - 1$ vertex tree connecting to the 1st leaf of the $\alpha$ vertex tree, requires more
discussion. The summed root-to-vertex depth of the first leaf of the $\alpha$ vertex tree is
$D_{1,\alpha}$. We next consider just a single path, with depth $d_1$, within the set of paths
that connect leaf $n - \alpha$ to the root in the $n - \alpha - 1$ vertex tree as shown by the
dashed line of Fig. 5. As there are $C_\alpha$ possible trees with $\alpha$ vertices, $d_1$ will
contribute $C_\alpha$ times in the sum of all connecting paths. Similarly the root will
contribute $1 \times C_\alpha$. The total length for paths connecting to the first leaf of the
$\alpha$ tree containing $d_1$ is $D_{1,\alpha} + C_\alpha + C_\alpha d_1$. To obtain the complete
summed path length we now sum over all $C_{n-\alpha-1}$ paths that connect leaf $n - \alpha$ to
the root in the $n - \alpha - 1$ tree $(d_1, d_2, \ldots, d_{C_{n-\alpha-1}})$,

$$
[D_{1,\alpha} + C_\alpha + C_\alpha d_1] + [D_{1,\alpha} + C_\alpha + C_\alpha d_2] + \cdots
+ [D_{1,\alpha} + C_\alpha + C_\alpha d_{C_{n-\alpha-1}}]. \tag{31}
$$

As $\sum_{k=1}^{C_{n-\alpha-1}} d_k = D_{n-\alpha,n-\alpha-1}$, the summed path length becomes

$$
C_\alpha D_{n-\alpha,n-\alpha-1} + C_{n-\alpha-1} D_{1,\alpha} + C_{n-\alpha-1} C_\alpha. \tag{32}
$$

The inclusion of contributions from all $\alpha$ yields the equation

$$
\begin{aligned}
S_n(1) = \sum_{\alpha=0}^{n-1} \big[ & C_\alpha D_{n-\alpha,n-\alpha-1} + C_{n-\alpha-1} D_{1,\alpha} + C_{n-\alpha-1} C_\alpha \\
& + S_{n-\alpha-1}(1) C_\alpha + S_\alpha(1) C_{n-\alpha-1} \big].
\end{aligned} \tag{33}
$$

It is possible to simplify equation (33) (see Appendix C.2) to give

$$
S_n(1) = 2C_{n+1} - 3C_n + 2 \sum_{\alpha=0}^{n-1} S_\alpha(1) C_{n-\alpha-1}
       = D_{2,n} + 2 \sum_{\alpha=0}^{n-1} S_\alpha(1) C_{n-\alpha-1}. \tag{34}
$$

In order to progress further, we again use the generating function approach. We define
the generating function for r = 1 as

$$
\mathcal{S}(1,x) = \sum_{n=0}^{\infty} S_n(1) x^n, \tag{35}
$$

with $S_0(1) = 0$. Then we can write


$$
\begin{aligned}
\mathcal{S}(1,x) &= \sum_{n=0}^{\infty} S_{n+1}(1) x^{n+1} \\
&= 2 \sum_{n=0}^{\infty} C_{n+2} x^{n+1} - 3 \sum_{n=0}^{\infty} C_{n+1} x^{n+1} + 2 \sum_{n=0}^{\infty} \sum_{\alpha=0}^{n} S_\alpha(1) C_{n-\alpha} x^{n+1} \\
&= \frac{2}{x} \left( \sum_{n=0}^{\infty} C_n x^n - C_0 - C_1 x \right) - 3 \left( \sum_{n=0}^{\infty} C_n x^n - C_0 \right) + 2x \sum_{n=0}^{\infty} \sum_{\alpha=0}^{n} S_\alpha(1) C_{n-\alpha} x^{n-\alpha} x^{\alpha} \\
&= \frac{2}{x} [C(x) - 1] - 2 - 3C(x) + 3 + 2x \sum_{\alpha=0}^{\infty} \sum_{q=-\alpha}^{\infty} H_q C_q x^q S_\alpha(1) x^{\alpha} \\
&= \frac{2}{x} [C(x) - 1] + 1 - 3C(x) + 2x \sum_{q=0}^{\infty} \sum_{\alpha=0}^{\infty} C_q x^q S_\alpha(1) x^{\alpha} \\
&= \frac{2}{x} [C(x) - 1] + 1 - 3C(x) + 2x \mathcal{S}(1,x) C(x),
\end{aligned} \tag{36}
$$

where $H_q$ is a step function ($H_q = 1$ if $q \geq 0$, 0 if $q < 0$). We solve for
$\mathcal{S}$ and rewrite

$$
\begin{aligned}
\mathcal{S}(1,x) &= \frac{\frac{2}{x}[C(x) - 1] + 1 - 3C(x)}{1 - 2xC(x)} = \frac{2C(x)^2 - 3C(x) + 1}{\sqrt{1 - 4x}} \\
&= \sum_{n=0}^{\infty} \left[ 2(n+1) C_{n+1} - 3(2n+1) C_n + (n+1) C_n \right] x^n,
\end{aligned} \tag{37}
$$

where in the first line we have used $xC(x)^2 = C(x) - 1$, and the result is further
simplified using the expressions found in Appendix C.3. The nth term,
$S_n(1) = [x^n]\{\mathcal{S}(1,x)\}$, is then

$$
S_n(1) = 2(n+1) C_{n+1} - (5n+2) C_n = \frac{3n^2 C_n}{n+2}. \tag{38}
$$

The total number of paths is $nC_n$, so finally the average path length becomes

$$
A_n(1) = \frac{3n}{n+2}. \tag{39}
$$

In the limit $n \to \infty$, we have $A_n(1) \to 3$.

Figure 6: The connecting paths for r = 3, shown (a) away from the edges of
the tree and (b) at the edge of the tree. The leaves highlighted in red are those
which have no connecting path due to the tree edge.
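The nearest-neighbour results can be reproduced by iterating the recursion (34) with
S_0(1) = 0 and comparing with the closed forms S_n(1) = 3n^2 C_n/(n + 2) and
A_n(1) = 3n/(n + 2). The following sketch uses exact rational arithmetic for the average;
the function names are illustrative only.

```python
from fractions import Fraction
from math import comb


def catalan(n):
    """Catalan number C_n = binom(2n, n) / (n + 1)."""
    return comb(2 * n, n) // (n + 1)


def summed_nearest_neighbour_lengths(n_max):
    """S_n(1) for n = 0..n_max from the recursion (34), with S_0(1) = 0."""
    s = [0]
    for n in range(1, n_max + 1):
        seed = 2 * catalan(n + 1) - 3 * catalan(n)  # equals D_{2,n}
        s.append(seed + 2 * sum(s[a] * catalan(n - a - 1) for a in range(n)))
    return s


def average_nearest_neighbour_length(n):
    """A_n(1) = S_n(1) / (n C_n), for n >= 1."""
    s_n = summed_nearest_neighbour_lengths(n)[n]
    return Fraction(s_n, n * catalan(n))


for n in (1, 2, 4, 10, 100):
    print(n, average_nearest_neighbour_length(n), float(Fraction(3 * n, n + 2)))
```

The printed averages approach 3 slowly, since the correction 6/(n + 2) decays only
algebraically in n.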


4.2 A recursive approach to summed leaf-to-leaf path lengths 

As in Section 4.1, we begin with the set of trees decomposed into an $\alpha$ and an
$n - \alpha - 1$ vertex tree. Similarly, we order the full expression into terms originating
from possible paths within each of the sub-trees, which we shall call the recursion terms,
and those paths that connect the sub-trees.

The recursion term $[S_n(r)]_{\mathrm{rec}}$ is simply the contribution to the summed path
length from the two sub-trees $S_{n-\alpha-1}(r)$ and $S_\alpha(r)$ with appropriate
degeneracies as before,

$$
\begin{aligned}
[S_n(r)]_{\mathrm{rec}} &= \sum_{\alpha=0}^{n-1} \left[ C_\alpha S_{n-\alpha-1}(r) + C_{n-\alpha-1} S_\alpha(r) \right] \\
&= \sum_{\alpha=0}^{n-1} \left[ C_{n-1-\alpha} S_{n-(n-1-\alpha)-1}(r) + C_{n-\alpha-1} S_\alpha(r) \right] \\
&= 2 \sum_{\alpha=0}^{n-1} C_{n-\alpha-1} S_\alpha(r),
\end{aligned} \tag{40}
$$

where in the second line we are effectively summing from n - 1 down to 0 in the first
term.

The connecting path is somewhat more complicated. In the case of nearest neighbours
there is only one path that connects the two sub-trees, while for general r there are r
connecting paths. Figure 6 shows the three connecting paths for r = 3. We observe that
this is only true for certain values of $\alpha$; near the edges of the tree the boundary
prevents the existence of all r connecting paths, as shown in Fig. 6(b). We shall thus refer
to the terms away from the edges as the bulk terms and the terms affected by the edge as the
boundary terms, both of which shall be treated separately.

The bulk terms are the simplest ones to create an expression for. They can be viewed
as a generalisation of the nearest neighbour case. As before, the length of each connecting
path is the sum of the depth of the leaf in the left hand sub-tree, the root node and
the depth of the leaf in the right sub-tree. Here, we are again looking for the summed
path lengths, so we need to use the depth functions $D_{m,n}$ for the depth of the leaves in
the left and right sub-trees. Each has a degeneracy given by the number of trees in the
opposing sub-tree and the root contributes 1 for each combination. We then need to sum over
the r different positions that the connecting paths can start and end on, labelled by
$\beta$, and also sum over the sub-tree combinations labelled by $\alpha$, noting that the
limits on the $\alpha$ sum are given by when this bulk behaviour is satisfied.

Figure 7: The connecting paths for r = 3 at the edge of the tree, with the bulk and
boundary regions indicated. The leaves highlighted in red are those which have no
connecting path due to the tree edge.

As a simple example take r = 3 again, shown schematically in Fig. 7. The $\beta$ index
counts over the three paths connecting leaf $\beta$ in the right sub-tree and leaf
$\alpha - 2 + \beta$ in the left. The bulk behaviour is present for $\alpha = 2$ to $n - 3$,
thus

$$
[S_n(3)]_{\mathrm{blk}} = \sum_{\alpha=2}^{n-3} \sum_{\beta=1}^{3}
\left[ C_{n-\alpha-1} D_{\alpha-2+\beta,\alpha} + C_\alpha D_{\beta,n-\alpha-1} + C_\alpha C_{n-\alpha-1} \right]. \tag{41}
$$

The discussion generalises easily for $r \leq (n+1)/2$, after which there is no obvious bulk
contribution as the lower limit on the $\alpha$ sum becomes larger than the upper limit. We
find

$$
\begin{aligned}
[S_n(r)]_{\mathrm{blk}} &= \sum_{\alpha=r-1}^{n-r} \sum_{\beta=1}^{r}
\left[ C_{n-\alpha-1} D_{\alpha+1-r+\beta,\alpha} + C_\alpha D_{\beta,n-\alpha-1} + C_\alpha C_{n-\alpha-1} \right] \\
&= \sum_{\alpha=r-1}^{n-r} \sum_{\beta=1}^{r}
\left[ 2 C_{n-\alpha-1} D_{\beta,\alpha} + C_\alpha C_{n-\alpha-1} \right], \qquad r \leq \frac{n+1}{2},
\end{aligned} \tag{42}
$$

where we have used the left-right symmetry $D_{m,n} = D_{n+2-m,n}$ and changed indices.

The paths where r > (n + 1)/2 have equivalent bulk terms due to the left-right
symmetry of the sub-trees as shown in Fig. 8. In this sense r = n is the same as r = 1,
r = n - 1 is equal to r = 2 and so on. The contribution to the bulk from these long range
terms will have the same form as for $r \leq (n+1)/2$, but in the limits on the sums r will
be replaced by n + 1 - r, i.e.

$$
[S_n(r)]_{\mathrm{blk}} = \sum_{\alpha=n-r}^{r-1} \sum_{\beta=1}^{n-r+1}
\left[ 2 C_{n-\alpha-1} D_{\beta,\alpha} + C_\alpha C_{n-\alpha-1} \right], \qquad r > \frac{n+1}{2}. \tag{43}
$$

The boundary terms are limited by the fact that one of the sub-trees has fewer than r
leaves, i.e. there cannot be r connecting paths. In the boundary case for $r \leq (n+1)/2$
there are $\alpha + 1$ contributions rather than r, so the $\beta$ index is limited by
$\alpha + 1$. The limits on the $\alpha$ sum are given by the set of sub-trees with fewer than
r leaves and hence fewer than r - 1 vertices. As shown in Fig. 9, when
$\alpha < n - \alpha - 1$ all of the leaves of the left sub-tree contribute but only those that
are r leaves away from a starting point do in the right sub-tree. This set starts with leaf
$r - \alpha - 1$ and goes up with $\beta$. The case $\alpha < n - \alpha - 1$ refers to the
left hand boundary as shown in Fig. 7. The right hand boundary is identical due to
the left-right symmetry of sub-trees. The total expression is

$$
[S_n(r)]_{\mathrm{bnd}} = 2 \sum_{\alpha=0}^{r-2} \sum_{\beta=1}^{\alpha+1}
\left[ C_{n-\alpha-1} D_{\beta,\alpha} + C_\alpha D_{r-\beta+1,n-\alpha-1} + C_\alpha C_{n-\alpha-1} \right],
\qquad r \leq \frac{n+1}{2}, \tag{44}
$$

where we have summed the $\beta$ index of the second term from top to bottom. As with the
bulk terms, the r > (n + 1)/2 term is identical when replacing r with n + 1 - r in the
limits of the sums,

$$
[S_n(r)]_{\mathrm{bnd}} = 2 \sum_{\alpha=0}^{n-r-1} \sum_{\beta=1}^{\alpha+1}
\left[ C_{n-\alpha-1} D_{\beta,\alpha} + C_\alpha D_{r-\beta+1,n-\alpha-1} + C_\alpha C_{n-\alpha-1} \right],
\qquad r > \frac{n+1}{2}. \tag{45}
$$

Figure 8: The connecting paths when r = n - 2. The long range paths are
equivalent to short range paths due to the left-right symmetry of the sub-trees.

Figure 9: The leaves that are involved in boundary paths when $\alpha < n - \alpha - 1$. All
of the leaves are involved in the left sub-tree, only the leaves r away are involved
in the right sub-tree, labelled by $r - 1 - \alpha + \beta$.

Here we have again used $D_{m,n} = D_{n+2-m,n}$ and changed the order of the sum in the second
term. When putting terms together the full equation is given as

$$
S_n(r) = [S_n(r)]_{\mathrm{blk}} + [S_n(r)]_{\mathrm{bnd}} + [S_n(r)]_{\mathrm{rec}}, \tag{46}
$$

with

$$
[S_n(r)]_{\mathrm{blk}} =
\begin{cases}
\displaystyle \sum_{\alpha=r-1}^{n-r} \sum_{\beta=0}^{r-1} \left( 2 C_{n-\alpha-1} D_{\beta+1,\alpha} + C_\alpha C_{n-\alpha-1} \right), & r \leq \frac{n+1}{2}, \\[2ex]
\displaystyle \sum_{\alpha=n-r}^{r-1} \sum_{\beta=0}^{n-r} \left( 2 C_{n-\alpha-1} D_{\beta+1,\alpha} + C_\alpha C_{n-\alpha-1} \right), & r > \frac{n+1}{2},
\end{cases} \tag{47}
$$

$$
[S_n(r)]_{\mathrm{bnd}} =
\begin{cases}
\displaystyle 2 \sum_{\alpha=0}^{r-2} \sum_{\beta=0}^{\alpha} \left( C_{n-\alpha-1} D_{\beta+1,\alpha} + C_\alpha D_{r-\beta,n-\alpha-1} + C_\alpha C_{n-\alpha-1} \right), & r \leq \frac{n+1}{2}, \\[2ex]
\displaystyle 2 \sum_{\alpha=0}^{n-r-1} \sum_{\beta=0}^{\alpha} \left( C_{n-\alpha-1} D_{\beta+1,\alpha} + C_\alpha D_{r-\beta,n-\alpha-1} + C_\alpha C_{n-\alpha-1} \right), & r > \frac{n+1}{2},
\end{cases} \tag{48}
$$

and

$$
[S_n(r)]_{\mathrm{rec}} = 2 \sum_{\alpha=0}^{n-1} C_{n-\alpha-1} S_\alpha(r). \tag{49}
$$

Note that the limits on the sums imply that

$$
S_n(r) = 0 \quad \forall\, r > n. \tag{50}
$$

We combine the bulk and boundary terms to create an inception, $I_n(r)$. The inception
seeds the recursion and sets the boundary properties. Then (46) becomes

$$
S_n(r) = I_n(r) + 2 \sum_{\alpha=0}^{n-1} C_{n-\alpha-1} S_\alpha(r). \tag{51}
$$
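The recursion (51) is straightforward to evaluate numerically once the inception is
assembled from the bulk and boundary sums. The sketch below is a minimal implementation
under two assumptions: the depth function is taken from Eq. (29), and the summation limits
are read as in Eqs. (47) and (48), with beta running from 0. All function names are
hypothetical.

```python
from functools import lru_cache
from math import comb


def catalan(n):
    """Catalan number C_n = binom(2n, n) / (n + 1)."""
    return comb(2 * n, n) // (n + 1)


def depth_sum(m, n):
    """D_{m,n} from Eq. (29), valid for 1 <= m <= n + 1."""
    tail = sum((m - k - 1) * catalan(k) * catalan(n - k) for k in range(m - 1))
    return m * catalan(n + 1) - catalan(n) - 2 * tail


def inception(n, r):
    """I_n(r): bulk terms (47) plus boundary terms (48)."""
    if r < 1 or r > n:
        return 0
    if 2 * r <= n + 1:  # short range, r <= (n + 1)/2
        bulk_alphas, n_beta = range(r - 1, n - r + 1), r
        boundary_alphas = range(0, r - 1)
    else:  # long range, r > (n + 1)/2
        bulk_alphas, n_beta = range(n - r, r), n - r + 1
        boundary_alphas = range(0, n - r)
    total = 0
    for a in bulk_alphas:
        for b in range(n_beta):
            total += 2 * catalan(n - a - 1) * depth_sum(b + 1, a)
            total += catalan(a) * catalan(n - a - 1)
    for a in boundary_alphas:
        for b in range(a + 1):
            total += 2 * (catalan(n - a - 1) * depth_sum(b + 1, a)
                          + catalan(a) * depth_sum(r - b, n - a - 1)
                          + catalan(a) * catalan(n - a - 1))
    return total


@lru_cache(maxsize=None)
def summed_path_length(n, r):
    """S_n(r) from the recursion (51); zero for r > n as in Eq. (50)."""
    if r > n:
        return 0
    return inception(n, r) + 2 * sum(
        catalan(n - a - 1) * summed_path_length(a, r) for a in range(n))


for r in (1, 2, 3):
    print(r, [summed_path_length(n, r) for n in range(r, r + 5)])
```

For r = 1 the boundary sums are empty and the inception reduces to D_{2,n}, so the
nearest-neighbour recursion (34) is recovered as a special case.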

4.3 Leaf-to-leaf path length for separation r = 2

Using the full master equation it is simple to obtain an expression for next-to-nearest
neighbours,

$$
S_n(2) = 6C_{n+1} - 14C_n + 2C_{n-1} + 2 \sum_{\alpha=0}^{n-1} C_{n-\alpha-1} S_\alpha(2). \tag{52}
$$

As in Section 4.1 we can use a generating function to obtain a closed form for the summed
path lengths, i.e.

$$
\mathcal{S}(2,x) = \sum_{n=0}^{\infty} S_n(2) x^n = \sum_{n=0}^{\infty} S_{n+1}(2) x^{n+1} + S_0(2)
                 = \sum_{n=0}^{\infty} S_{n+1}(2) x^{n+1}, \tag{53}
$$

where $S_0(2) = 0$ due to (50). Here, and for the rest of the section, we shall use $S_n$ to
mean $S_n(2)$, i.e. we drop the 2. We start with $S_{n+1}$, multiply (52) by $x^{n+1}$ and
sum to get

$$
\sum_{n=0}^{\infty} S_{n+1} x^{n+1} = \sum_{n=0}^{\infty} (6C_{n+2} - 14C_{n+1} + 2C_n) x^{n+1}
+ 2 \sum_{n=0}^{\infty} \sum_{\alpha=0}^{n} C_{n-\alpha} S_\alpha x^{n+1}. \tag{54}
$$

Putting in a step function and summing to infinity we obtain

$$
\mathcal{S}(x) = 6C^2(x) + 8 - 14C(x) + 2xC(x) + 2xC(x)\mathcal{S}(x). \tag{55}
$$

Collecting the $\mathcal{S}(x)$ terms and using $1 - 2xC(x) = \sqrt{1 - 4x}$ gives, after some
algebra shown in Appendix C.4,

$$
S_n = \frac{(n-1)(5n-2)}{n+2}\, C_n. \tag{56}
$$

To obtain the average path length we need to divide by the number of possible paths,
which is $(n - 1)C_n$, to get

$$
A_n(2) = \frac{5n-2}{n+2}. \tag{57}
$$

In the limit $n \to \infty$, we have $A_n(2) \to 5$.


4.4 Larger separations 

Using the same method as above it is possible to get an expression for r = 3,

$$
S_n(3) = \frac{(n-2)(13n^2 - 18n + 2)\, C_n}{(n+2)(2n-1)}. \tag{58}
$$

Dividing through by $(n - 2)C_n$ gives the average

$$
A_n(3) = \frac{13n^2 - 18n + 2}{(n+2)(2n-1)}, \tag{59}
$$

and $A_n(3) \to 13/2$ in the limit $n \to \infty$.
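The closed forms for r = 1, 2 and 3 can be compared with a direct enumeration of all trees
for small n. In the sketch below each leaf is labelled by its sequence of left/right turns
from the root, and the path length between two leaves is taken as the number of internal
vertices on the path joining them. The encoding and the function names are assumptions
made for this illustration.

```python
from fractions import Fraction
from functools import lru_cache
from math import comb


def catalan(n):
    """Catalan number C_n = binom(2n, n) / (n + 1)."""
    return comb(2 * n, n) // (n + 1)


@lru_cache(maxsize=None)
def full_binary_trees(n):
    """All full binary trees with n internal vertices; a leaf is None."""
    if n == 0:
        return (None,)
    return tuple((left, right)
                 for p in range(n)
                 for left in full_binary_trees(p)
                 for right in full_binary_trees(n - 1 - p))


def leaf_codes(tree, prefix=""):
    """Left/right turn sequence from the root to each leaf, ordered left to right."""
    if tree is None:
        return [prefix]
    left, right = tree
    return leaf_codes(left, prefix + "L") + leaf_codes(right, prefix + "R")


def path_length(code_a, code_b):
    """Number of internal vertices on the path between two distinct leaves."""
    common = 0
    for step_a, step_b in zip(code_a, code_b):
        if step_a != step_b:
            break
        common += 1
    return len(code_a) + len(code_b) - 2 * common - 1


def summed_path_length_brute(n, r):
    """S_n(r) by enumeration of all trees with n vertices."""
    total = 0
    for tree in full_binary_trees(n):
        codes = leaf_codes(tree)
        for i in range(len(codes) - r):
            total += path_length(codes[i], codes[i + r])
    return total


def average_path_length_brute(n, r):
    """A_n(r) = S_n(r) / ((n + 1 - r) C_n), for 1 <= r <= n."""
    return Fraction(summed_path_length_brute(n, r), (n + 1 - r) * catalan(n))


for n in range(3, 7):
    print(n, [average_path_length_brute(n, r) for r in (1, 2, 3)])
```

For large n the three averages tend to 3, 5 and 13/2 respectively, although the
enumeration itself is limited to small trees.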
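The method can be generalised for any r given an appropriate inception function.
Starting with (51) and using the generating functions
$\mathcal{S}(r,x) = \sum_{n=r-1}^{\infty} S_n(r) x^n$ and
$\mathcal{I}(r,x) = \sum_{n=r-1}^{\infty} I_n(r) x^n$, we proceed as before,

$$
\begin{aligned}
\mathcal{S}(r,x) &= \sum_{n=r-1}^{\infty} S_n(r) x^n \\
&= \sum_{n=r-1}^{\infty} I_n(r) x^n + 2 \sum_{n=r-1}^{\infty} \sum_{\alpha=0}^{n-1} C_{n-\alpha-1} S_\alpha(r) x^n \\
&= \sum_{n=r-1}^{\infty} I_n(r) x^n + 2 \sum_{n=r-1}^{\infty} \sum_{\alpha=0}^{\infty} H_{n-\alpha-1} C_{n-\alpha-1} S_\alpha(r) x^n \\
&= \sum_{n=r-1}^{\infty} I_n(r) x^n + 2x \sum_{\alpha=0}^{\infty} S_\alpha(r) x^{\alpha} \sum_{q=r-\alpha-2}^{\infty} H_q C_q x^q \\
&= \sum_{n=r-1}^{\infty} I_n(r) x^n + 2x \sum_{\alpha=r-1}^{\infty} S_\alpha(r) x^{\alpha} \sum_{q=0}^{\infty} C_q x^q \\
&= \sum_{n=r-1}^{\infty} I_n(r) x^n + 2x \sum_{\alpha=r-1}^{\infty} S_\alpha(r) x^{\alpha} C(x) \\
&= \mathcal{I}(r,x) + 2xC(x)\mathcal{S}(r,x),
\end{aligned} \tag{60}
$$

where on the third line we have made use of the step function $H_q$ and recalled (50), so
that the lower limit of the $\alpha$ sum can be raised to match the rest. Collecting the
$\mathcal{S}(r,x)$ terms and dividing by $[1 - 2xC(x)]$ gives

$$
\mathcal{S}(r,x) = \frac{\mathcal{I}(r,x)}{1 - 2xC(x)}. \tag{61}
$$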


The issue now is to construct a general form for the inception. Unfortunately, this turns
out to be as hard as finding a general equation for the path lengths. It is possible to use
the generalised master equation to get each single case, but the number of terms grows
rapidly with r and it soon becomes impractical to obtain these by hand.


5 Diagrammatic approach to counting rooted paths 
in Catalan trees 

5.1 A diagrammatic approach 

In this section, we develop a diagrammatic notation in which a vertex represents x so
that we can directly add together diagrams. This will allow us to more easily see what
patterns develop and we can then replace diagrams with powers of x at the end of the
calculation. We begin by setting up a diagrammatic dictionary such that

[Diagrammatic dictionary: a single vertex represents x and the tree symbol represents C(x).] (62)

This dictionary allows us to construct a diagrammatic equation for the generating function
C(x), i.e.

[Diagrammatic equation for C(x).] (63)

when comparing to Fig. 2. We also notice that the generating function can be generated
recursively according to the formula

[Diagrammatic recursion for C(x).] (64)

Translating back into algebra, we recover the fundamental relation (7).


5.2 Enumerating external rooted paths 

We now want to find how many binary trees with n vertices there are such that the 
leftmost (or equivalently rightmost) leaf is k vertices away from the root. We will call 
such a configuration an external (because it is on the outer left/right leaf) rooted (because 
it starts at the root) path of length k on a binary tree with n vertices. We will denote 
by Kjk the number of external rooted paths of length k on a binary tree with n — j + k 
vertices (so that there are always at least k vertices). We also define its generating function 
as IC(x,z) = V ;/ . l\ ji,..r J z k . 

In order to highlight the path along the leftmost leaves (we could equivalently have chosen the rightmost) we define a new diagrammatic dictionary

[Diagrammatic dictionary, not reproduced: the vertices on the external path carry the weight $z$, and the symbol for the full series stands for $\mathcal{K}(x,z)$.] (65)

Here the $z$ argument counts the vertices on (left) external rooted paths, while the $x$ term counts the remaining ones (to the right). The diagrammatic expression for the generating function $\mathcal{K}(x,z)$ is then

[Diagrammatic series for $\mathcal{K}(x,z)$, not reproduced.]

From the results of the previous section we can sum over all of the $x$ vertices to get to

[Diagram not reproduced.] (67)

At this point we recognise that the diagrammatic expression (67) is generated by the recursion relation

[Diagrammatic recursion, not reproduced.] (68)

which we translate back into algebra according to our diagrammatic dictionary to finally arrive at

$$\mathcal{K}(x,z) = 1 + zC(x)\,\mathcal{K}(x,z) = \frac{1}{1 - zC(x)}. \tag{69}$$

In the following it will be useful to notice that, using (7),

$$\mathcal{K}(x,x) = C(x). \tag{70}$$

It is also useful to notice that

$$\frac{\partial \mathcal{K}(x,z)}{\partial z} = C(x)\,\mathcal{K}(x,z)^2. \tag{71}$$

Suppose now we want to know the average length of an external rooted path on a binary tree with $n$ vertices. First we can calculate the sum of the lengths of all external rooted paths by taking the number of such paths of length $k$, multiplying by $k$ and summing, then we just divide by the total number of trees. Thus we want to compute $\epsilon_n = E_n/C_n$ where $E_n = \sum_{k=1}^{n} k K_{n-k,k}$. We can calculate $E_n$ as follows

$$\begin{aligned}
E_n = \sum_{k=1}^{n} k K_{n-k,k} &= \sum_{k=1}^{n} k\,[x^{n-k}z^k]\{\mathcal{K}(x,z)\} \\
&= \sum_{k=1}^{n} [x^{n-k}z^k]\left\{ z\frac{\partial}{\partial z}\mathcal{K}(x,z) \right\} \\
&= \sum_{k=1}^{n} [x^{n-k}z^k]\left\{ zC(x)\,\mathcal{K}(x,z)^2 \right\} \\
&= [x^n]\{x\,\mathcal{K}(x,x)^3\}.
\end{aligned} \tag{72}$$


In order to calculate the average it will be useful to get our result in terms of the Catalan numbers so we continue in that direction,

$$E_n = [x^n]\{xC(x)^3\} = [x^n]\{C(x)^2 - C(x)\} = [x^n]\left\{\frac{C(x)-1}{x}\right\} - C_n = [x^{n+1}]\{C(x)\} - C_n = C_{n+1} - C_n. \tag{73}$$

Now to find the average length $\epsilon_n = E_n/C_n$ of externally rooted paths we calculate

$$\epsilon_n = \frac{C_{n+1}}{C_n} - 1 = \frac{(2n+2)!}{(n+1)!^2\,(n+2)}\,\frac{(n!)^2\,(n+1)}{(2n)!} - 1 = \frac{(2n+2)(2n+1)(n+1)}{(n+1)^2(n+2)} - 1 = \frac{4n+2}{n+2} - 1 = \frac{3n}{n+2}. \tag{74}$$

In the limit $n \to \infty$ we find that the average length approaches $\epsilon_\infty = 3$.
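As a cross-check of (73) and (74), the following minimal sketch counts external rooted paths by brute force. It assumes the convention used above (a tree with $n$ vertices has $n$ internal vertices, and the path length is the number of vertices between the root and the leftmost leaf, root included); the helper names `leftmost_depths` and `summed_external_length` are ours and purely illustrative.

```python
from functools import lru_cache
from math import comb

def catalan(n):
    return comb(2 * n, n) // (n + 1)

@lru_cache(maxsize=None)
def leftmost_depths(n):
    """Map k -> number of trees with n vertices whose leftmost leaf is k vertices from the root."""
    if n == 0:
        return {0: 1}                      # a bare leaf: the path holds no vertex
    counts = {}
    for left in range(n):                  # the root keeps `left` vertices in its left subtree
        right_trees = catalan(n - 1 - left)
        for k, c in leftmost_depths(left).items():
            counts[k + 1] = counts.get(k + 1, 0) + c * right_trees
    return counts

def summed_external_length(n):
    return sum(k * c for k, c in leftmost_depths(n).items())

for n in range(1, 10):
    E_n = summed_external_length(n)
    assert E_n == catalan(n + 1) - catalan(n)      # Eq. (73)
    assert E_n * (n + 2) == 3 * n * catalan(n)     # Eq. (74), cleared of fractions
```

The recursion only uses the fact that the leftmost leaf of a tree is the leftmost leaf of its left subtree, one vertex further from the root.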


5.3 Enumerating Rooted Paths on Catalan Trees 

We now generalise the previous result to paths which pass through the tree rather than just along the outer edge. We ask how many binary trees with $n$ vertices there are such that the $p$th leaf in from the left is $m$ vertices away from the root. We will call this configuration a rooted path of penetration $p$ on a binary tree with $n$ vertices. According to this definition, an external rooted path is a rooted path with penetration $p = 0$. In order to understand the penetration we need to distinguish between those vertices to the left of the path and those vertices to the right of the path. We will also need to distinguish between those vertices on the path that veer to the left and those that veer to the right.

With this in mind we construct the generating function $\mathcal{J}(\xi,x,z,\zeta)$ which enumerates those trees containing (i) $i$ vertices to the left of the path, (ii) $j$ vertices to the right of the path, (iii) $k$ left-turning vertices within the path, and (iv) $l$ right-turning vertices within the path. We will call the series generated $J_{ijkl} = [\xi^i x^j z^k \zeta^l]\{\mathcal{J}(\xi,x,z,\zeta)\}$.
Now the diagrammatic method becomes particularly useful. We define a diagrammatic dictionary

[Diagrammatic dictionary, not reproduced. The recoverable entries assign symbols to $x$, $C(x)$, $C(\xi)$, $\mathcal{K}(x,z)$ and $\mathcal{J}(\xi,x,z,\zeta)$.] (75)

In advance we can tell that anything happening to the right of the path can be neglected if we replace the edges in the path with external rooted paths. This lets us skip ahead a number of steps and write

[Diagrammatic series for $\mathcal{J}$, not reproduced.]

With our experience so far we can sum out the $\xi$ vertices, denoting Catalan sub-trees, to get

[Diagram not reproduced.]

We translate back into algebra and find

$$\mathcal{J}(\xi,x,z,\zeta) = \mathcal{K}(x,z) + \zeta C(\xi)\,\mathcal{K}(x,z)\,\mathcal{J}(\xi,x,z,\zeta) \tag{79}$$

$$= \frac{\mathcal{K}(x,z)}{1 - \zeta C(\xi)\,\mathcal{K}(x,z)} = \frac{1}{\mathcal{K}(x,z)^{-1} - \zeta C(\xi)} = \frac{1}{1 - zC(x) - \zeta C(\xi)} \tag{80}$$

$$= \frac{\mathcal{K}(\xi,\zeta)}{1 - zC(x)\,\mathcal{K}(\xi,\zeta)} = \mathcal{J}(x,\xi,\zeta,z). \tag{81}$$

The symmetry here is due to the left-right reflection symmetry. We may write this generating function as

$$\mathcal{J}(\xi,x,z,\zeta) = \mathcal{K}(x,z) + \zeta\,\mathcal{K}(x,z)^2\,\frac{C(\xi)}{1 - \zeta\,\mathcal{K}(x,z)\,C(\xi)}. \tag{82}$$

This is useful because of the known series expansion¹

$$\frac{C(t)}{1 - sC(t)} = 1 + (s + t) + (s^2 + 2st + 2t^2) + (s^3 + 3s^2t + 5st^2 + 5t^3) + \ldots \tag{83}$$

We can hence expand out the generating function as

$$\begin{aligned}
\mathcal{J}(\xi,x,z,\zeta) = {} & \mathcal{K}(x,z) + \zeta\,\mathcal{K}(x,z)^2 + \left[\zeta^2\mathcal{K}(x,z)^3 + \xi\zeta\,\mathcal{K}(x,z)^2\right] \\
& + \left[\zeta^3\mathcal{K}(x,z)^4 + 2\xi\zeta^2\mathcal{K}(x,z)^3 + 2\xi^2\zeta\,\mathcal{K}(x,z)^2\right] \\
& + \left[\zeta^4\mathcal{K}(x,z)^5 + 3\xi\zeta^3\mathcal{K}(x,z)^4 + 5\xi^2\zeta^2\mathcal{K}(x,z)^3 + 5\xi^3\zeta\,\mathcal{K}(x,z)^2\right] + \ldots
\end{aligned} \tag{84}$$
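The expansion quoted above can be checked numerically. The sketch below, which uses only truncated power series in $t$, assumes the generating function $C(t)/(1 - sC(t)) = \sum_j s^j C(t)^{j+1}$ and compares its coefficients with Catalan's triangle built from the rule "entry above plus entry to the left". The truncation order `N` and the helper `mul` are our own choices.

```python
from math import comb

N = 6  # highest total degree checked

def catalan(n):
    return comb(2 * n, n) // (n + 1)

def mul(a, b):
    """Product of two power series in t, truncated after t^N."""
    out = [0] * (N + 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            if i + j <= N:
                out[i + j] += ai * bj
    return out

C = [catalan(k) for k in range(N + 1)]

# C(t)/(1 - s C(t)) = sum_j s^j C(t)^(j+1), so [s^j t^k] = [t^k] C(t)^(j+1).
powers = [C]
for _ in range(N):
    powers.append(mul(powers[-1], C))
from_series = [[powers[n - k][k] for k in range(n + 1)] for n in range(N + 1)]

# Catalan's triangle: each entry is the one above plus the one to its left.
triangle = [[1]]
for n in range(1, N + 1):
    row = []
    for k in range(n + 1):
        above = triangle[n - 1][k] if k <= n - 1 else 0
        left = row[k - 1] if k > 0 else 0
        row.append(above + left)
    triangle.append(row)

assert from_series == triangle
```

Row $n$ of `triangle` lists the coefficients of $s^{n-k}t^k$ for $k = 0, \ldots, n$, which is the ordering used in the bracketed terms of the expansion.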

For a path to have penetration $p$ we need to have $i$ vertices to the left of the path ($\xi$) and $l$ right-turning vertices ($\zeta$) such that $i + l = p$. The easiest way to enforce this is to multiply both $\xi$ and $\zeta$ by some dummy variable $y$ within the generating function and then look at the coefficient of $y^p$ in the resulting expansion. Let us therefore define

$$\mathcal{J}^{(p)}(\xi,x,z,\zeta) = [y^p]\{\mathcal{J}(y\xi,x,z,y\zeta)\} = [y^p]\left\{\frac{1}{1 - zC(x) - y\zeta C(y\xi)}\right\}. \tag{85}$$

¹ This is the generating function of Catalan's triangle, in which an element is equal to the one above plus the one to the left, as illustrated below:

1 0 0 0 0 0 0
1 1 0 0 0 0 0
1 2 2 0 0 0 0
1 3 5 5 0 0 0
1 4 9 14 14 0 0
1 5 14 28 42 42 0

From this we can write down the sum of the lengths of the paths from the root to the $p$th leaf (starting from zero), i.e. the depth function for the $(p+1)$th leaf defined in Eq. (29) of section 3.2, as


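$$D_n^{(p)} = D_{p+1,n} = \sum_{i+j+k+l=n} (k+l)\,[\xi^i x^j z^k \zeta^l]\{\mathcal{J}^{(p)}(\xi,x,z,\zeta)\}. \tag{86}$$

Exactly as before we can begin to manipulate this expression to bring it into a simpler form

$$\begin{aligned}
D_n^{(p)} &= \sum_{i+j+k+l=n} (k+l)\,[\xi^i x^j z^k \zeta^l y^p]\{\mathcal{J}(y\xi,x,z,y\zeta)\} \\
&= \sum_{i+j+k+l=n} [\xi^i x^j z^k \zeta^l y^p]\left\{\left(z\frac{\partial}{\partial z} + \zeta\frac{\partial}{\partial \zeta}\right)\frac{1}{1 - zC(x) - y\zeta C(y\xi)}\right\} \\
&= \sum_{i+j+k+l=n} [\xi^i x^j z^k \zeta^l y^p]\left\{\frac{zC(x)}{(1 - zC(x) - y\zeta C(y\xi))^2} + \frac{y\zeta C(y\xi)}{(1 - zC(x) - y\zeta C(y\xi))^2}\right\} \\
&= [x^n y^p]\left\{\frac{xC(x) + xyC(xy)}{(1 - xC(x) - xyC(xy))^2}\right\} \\
&= [x^n y^p]\left\{\frac{1}{(xC(x) + xyC(xy) - 1)^2} + \frac{1}{(xC(x) + xyC(xy) - 1)}\right\}.
\end{aligned} \tag{87}$$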


A slight simplification can be made to each of these terms by multiplying top and bottom by $xC(x) - xyC(xy)$, since $1/(1 - xC(x) - xyC(xy)) = (xC(x) - xyC(xy))/(x - xy)$. This allows us to bring the generating function for the summed lengths into the simple form

$$D_n^{(p)} = [x^n y^p]\left\{\left(\frac{C(x) - yC(xy)}{1 - y}\right)^2 - \frac{C(x) - yC(xy)}{1 - y}\right\}. \tag{88}$$


By applying the identity (B.13) we can begin to evaluate the coefficient of $x^n$ in this generating function,

$$\begin{aligned}
D_n^{(p)} &= [x^n y^p]\left\{\frac{1}{(1-y)^2}\left[C(x)^2 + y^2C(xy)^2 - 2yC(x)C(xy)\right] - \frac{1}{1-y}\left[C(x) - yC(xy)\right]\right\} \\
&= [y^p]\left\{\frac{1}{(1-y)^2}\left[[x^n]\{C(x)^2\} + y^2 y^n [x^n]\{C(x)^2\} - 2y[x^n]\{C(x)C(xy)\}\right] - \frac{1}{1-y}\left[[x^n]\{C(x)\} - y\,y^n[x^n]\{C(x)\}\right]\right\} \\
&= [y^p]\left\{\frac{1 + y^{n+2}}{(1-y)^2}\,C_{n+1} - \frac{1 - y^{n+1}}{1-y}\,C_n - \frac{2y}{(1-y)^2}\,[x^n]\{C(x)C(xy)\}\right\},
\end{aligned} \tag{89}$$


the final term here can be evaluated using (B.8). We can also notice here that $p$ ought to be defined to be less than $n$ (there is no way a path can contain more vertices than the tree it runs through) so we can completely ignore the terms containing $y^n$. Then we should recognise that $[y^p]\{(1-y)^{-2}\} = (p+1)$ and so we see that

$$D_n^{(p)} = (p+1)C_{n+1} - C_n - 2[y^p]\left\{\frac{y}{(1-y)^2}\sum_{k=0}^{n} C_{n-k}\,y^k C_k\right\} = (p+1)C_{n+1} - C_n - 2[y^p]\left\{\sum_{k=0}^{n}\frac{y^{k+1}}{(1-y)^2}\,C_kC_{n-k}\right\}. \tag{90}$$

The sum in the final term here contains powers of $y$ running up to $n+1$ but we only want terms with powers of $y \le p$, so we restrict the summation accordingly,

$$\begin{aligned}
D_n^{(p)} &= (p+1)C_{n+1} - C_n - 2\sum_{k=0}^{p-1}[y^p]\left\{\frac{y^{k+1}}{(1-y)^2}\right\}C_kC_{n-k} \\
&= (p+1)C_{n+1} - C_n - 2\sum_{k=0}^{p-1}[y^{p-k-1}]\left\{\frac{1}{(1-y)^2}\right\}C_kC_{n-k}.
\end{aligned} \tag{91}$$

Finally we arrive at an expression for the summed length of paths from the root to the $p$th leaf in a binary tree with $n$ vertices,

$$D_n^{(p)} = (p+1)C_{n+1} - C_n - 2\sum_{k=0}^{p-1}(p-k)\,C_kC_{n-k}. \tag{92}$$

This is, unsurprisingly, just the form (29) derived in section 3.2, i.e. $D_n^{(p)} = D_{p+1,n}$.
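The closed form (92) can be compared with a direct recursive count. The sketch below assumes leaves are indexed from zero on the left and that the depth of a leaf is the number of vertices on its path to the root; `depth_sum` and `formula_92` are illustrative names, not taken from the text.

```python
from functools import lru_cache
from math import comb

def catalan(n):
    return comb(2 * n, n) // (n + 1)

@lru_cache(maxsize=None)
def depth_sum(n, p):
    """Summed depth of leaf p (0 <= p <= n) over all binary trees with n vertices."""
    if n == 0:
        return 0
    total = 0
    for left in range(n):                 # root + left subtree + right subtree
        right = n - 1 - left
        if p <= left:                     # leaf p sits in the left subtree
            total += (depth_sum(left, p) + catalan(left)) * catalan(right)
        else:                             # otherwise it sits in the right subtree
            q = p - left - 1
            total += catalan(left) * (depth_sum(right, q) + catalan(right))
    return total

def formula_92(n, p):
    return ((p + 1) * catalan(n + 1) - catalan(n)
            - 2 * sum((p - k) * catalan(k) * catalan(n - k) for k in range(p)))

for n in range(1, 12):
    for p in range(n + 1):
        assert depth_sum(n, p) == formula_92(n, p)
```

Each step adds one to the depth for every tree (the root vertex) and then recurses into whichever subtree contains the chosen leaf.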


² Starting from the form (with $z = xy$ and $C(x) = 1 + u(x)$)

$$D_n^{(p)} = [x^n y^p]\left\{\frac{1}{[1 - xC(x) - xyC(xy)]^2} - \frac{1}{[1 - xC(x) - xyC(xy)]}\right\} = [x^n y^p]\left\{\frac{C(x)^2C(xy)^2}{[C(x) + C(xy) - C(x)C(xy)]^2} - \frac{C(x)C(xy)}{[C(x) + C(xy) - C(x)C(xy)]}\right\} \tag{93}$$

$$= [x^{n-p} z^p]\left\{\frac{C(x)^2C(z)^2}{[C(x) + C(z) - C(x)C(z)]^2} - \frac{C(x)C(z)}{[C(x) + C(z) - C(x)C(z)]}\right\} = [x^{n-p} z^p]\left\{\frac{[1 + u(x)]^2[1 + u(z)]^2}{[1 - u(x)u(z)]^2} - \frac{[1 + u(x)][1 + u(z)]}{[1 - u(x)u(z)]}\right\} \tag{94}$$

we ought to be able to apply the Lagrange inversion formula and hence find a closed form for $D_n^{(p)}$. The expressions thus generated turn out to be quite complicated.

6 Diagrammatic approach to leaf-to-leaf path lengths
in ordered Catalan trees

Our primary task is to enumerate paths between leaves on the tree. We will say that two leaves have separation $s$ ($s = r - 1$) if there are $s$ leaves sitting between the two. By this definition adjacent leaves have separation $s = 0$ ($r = 1$). In order to enumerate leaf-to-leaf paths with a given separation it is necessary to distinguish between the left and right arcs of the path following the initial bifurcation. Furthermore we need to distinguish between those vertices sitting between the two arcs of the path and those vertices sitting to the left of the left arc or the right of the right arc. For our purposes, those vertices to the left of the left arc need not be distinguished from those to the right of the right arc. The vertices within an arc need to be separated into those which turn left and those which turn right, so our diagrammatic dictionary must contain the following vertices


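[Diagrammatic dictionary for the vertices around and within the two arcs, not reproduced.] (95a)

As yet we have made no reference to those vertices sitting above/outside the initial bifurcation. To account for these we need to extend our diagrammatic dictionary to incorporate the following

[Diagrammatic dictionary entries for the vertices above the initial bifurcation, not reproduced.]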



The generating function for leaf-to-leaf paths needs to enumerate those trees containing (i) $i$ vertices to the left of the left arc or the right of the right arc, (ii) $j$ vertices to the right of the left arc and the left of the right arc, (iii) $k$ left-turning vertices within the left arc, (iv) $l$ right-turning vertices within the left arc, (v) $\bar{k}$ right-turning vertices within the right arc, (vi) $\bar{l}$ left-turning vertices within the right arc, (vii) $m$ vertices above the initial bifurcation, (viii) 1 initial bifurcation. The series so generated will be denoted

$$M_{ijkl\bar{k}\bar{l}m} = [\xi^i x^j z^k \zeta^l \bar{z}^{\bar{k}} \bar{\zeta}^{\bar{l}} \lambda^m \mu]\{\mu\mathcal{M}(\xi,x,z,\zeta,\bar{z},\bar{\zeta},\lambda)\}. \tag{96}$$

We will represent the generating function with the diagram

[Diagram not reproduced] $= \mu\mathcal{M}(\xi,x,z,\zeta,\bar{z},\bar{\zeta},\lambda)$. (97)

We begin by writing out the diagrammatic series



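[Diagrammatic series for $\mu\mathcal{M}$, not reproduced.] (98)

We notice that certain diagrams within this series translate to the same term, for example

[Example diagrams not reproduced.]

Taking this into account, we factorise the generating function such that

[Factorised diagrammatic series, not reproduced.] (99)

The second series in parentheses can itself be factorised to give

[Diagram not reproduced.] (100)

The latter two series in parentheses we recognise as the generating function (82) of rooted paths on the left and right arcs, respectively. So let us write

[left-arc diagram] $= \mathcal{J}(\xi,x,z,\zeta)$, [right-arc diagram] $= \mathcal{J}(\xi,x,\bar{z},\bar{\zeta})$, (101)

using which we have

[Diagram not reproduced.]

The remaining series left to be summed can be seen to be $\sum_n (n+1)C_n\lambda^n$, which we can write as $\frac{d}{d\lambda}[\lambda C(\lambda)]$. Altogether we find for the generating function

$$\mu\mathcal{M}(\xi,x,z,\zeta,\bar{z},\bar{\zeta},\lambda) = \mu \times \mathcal{J}(\xi,x,z,\zeta) \times \mathcal{J}(\xi,x,\bar{z},\bar{\zeta}) \times \frac{d}{d\lambda}[\lambda C(\lambda)]. \tag{103}$$

The derivative of $C(\lambda)$ can be easily determined given (7) to give $\frac{d}{d\lambda}C(\lambda) = C(\lambda)^2 + 2\lambda C(\lambda)\frac{d}{d\lambda}C(\lambda) = \frac{C(\lambda)^2}{1 - 2\lambda C(\lambda)}$ and $\frac{d}{d\lambda}[\lambda C(\lambda)] = C(\lambda) + \frac{\lambda C(\lambda)^2}{1 - 2\lambda C(\lambda)} = \frac{1}{1 - 2\lambda C(\lambda)}$. Finally we arrive at the generating function

$$\mathcal{M}(\xi,x,z,\zeta,\bar{z},\bar{\zeta},\lambda) = \frac{1}{1 - zC(x) - \zeta C(\xi)} \times \frac{1}{1 - \bar{z}C(x) - \bar{\zeta}C(\xi)} \times \frac{1}{1 - 2\lambda C(\lambda)}. \tag{104}$$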


The separation between the two terminal leaves is given by the number of vertices ($x$) sitting between the two arcs of the path plus the number of left-turning vertices ($z$) in the left arc of the path plus the number of right-turning vertices ($\bar{z}$) in the right arc of the path. With this in mind, we can write the generating function for leaf-to-leaf paths with a fixed separation $s$ as

$$\mathcal{M}^{(s)}(\xi,x,z,\zeta,\bar{z},\bar{\zeta},\lambda) = [y^s]\{\mathcal{M}(\xi,yx,yz,\zeta,y\bar{z},\bar{\zeta},\lambda)\}. \tag{105}$$

To find the average length of a path between leaves with separation $s$ we first need to sum, over all trees with $n$ vertices, the number of paths with length $k + l + \bar{k} + \bar{l} + 1$ multiplied by that length. The factor of 1 here comes in because we always need to account for the initial bifurcation given by the $\mu$ vertex. We therefore define

$$S_n^{(s)} = \sum_{i+j+k+l+\bar{k}+\bar{l}+m+1=n} (k + l + \bar{k} + \bar{l} + 1)\,M^{(s)}_{ijkl\bar{k}\bar{l}m}. \tag{106}$$

Comparing with the definitions used in section 4 we realise that

$$S_n^{(s)} = S_n(s+1), \quad\text{or, equivalently,}\quad S_n^{(r-1)} = S_n(r). \tag{107}$$










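We have manipulated expressions like (106) before so the way forward is clear. First using (B.12) we simplify the summand

$$\begin{aligned}
(k + l + \bar{k} + \bar{l} + 1)\,M^{(s)}_{ijkl\bar{k}\bar{l}m}
&= (k + l + \bar{k} + \bar{l} + 1)\,[\xi^i x^j z^k \zeta^l \bar{z}^{\bar{k}} \bar{\zeta}^{\bar{l}} \lambda^m y^s]\{\mathcal{M}(\xi,yx,yz,\zeta,y\bar{z},\bar{\zeta},\lambda)\} \\
&= [\xi^i x^j z^k \zeta^l \bar{z}^{\bar{k}} \bar{\zeta}^{\bar{l}} \lambda^m y^s]\left\{\left(z\frac{\partial}{\partial z} + \zeta\frac{\partial}{\partial \zeta} + \bar{z}\frac{\partial}{\partial \bar{z}} + \bar{\zeta}\frac{\partial}{\partial \bar{\zeta}} + 1\right)\mathcal{M}(\xi,yx,yz,\zeta,y\bar{z},\bar{\zeta},\lambda)\right\}.
\end{aligned}$$

The derivatives can be performed straightforwardly,

$$\begin{aligned}
&\left(z\frac{\partial}{\partial z} + \zeta\frac{\partial}{\partial \zeta} + \bar{z}\frac{\partial}{\partial \bar{z}} + \bar{\zeta}\frac{\partial}{\partial \bar{\zeta}} + 1\right)\mathcal{M}(\xi,yx,yz,\zeta,y\bar{z},\bar{\zeta},\lambda) \\
&\quad = \frac{1}{1 - 2\lambda C(\lambda)}\,\frac{1}{1 - yzC(yx) - \zeta C(\xi)}\,\frac{1}{1 - y\bar{z}C(yx) - \bar{\zeta}C(\xi)} \\
&\qquad \times \left[\frac{yzC(yx) + \zeta C(\xi)}{1 - yzC(yx) - \zeta C(\xi)} + \frac{y\bar{z}C(yx) + \bar{\zeta}C(\xi)}{1 - y\bar{z}C(yx) - \bar{\zeta}C(\xi)} + 1\right].
\end{aligned} \tag{108}$$

Applying the formula (B.3) to $\xi$, $x$, $z$, $\bar{z}$, $\zeta$, $\bar{\zeta}$ and $\lambda$, $S_n^{(s)}$ is reduced to

$$S_n^{(s)} = [x^n y^s]\left\{\frac{x}{1 - 2xC(x)}\left[\frac{2[xC(x) + xyC(xy)]}{[1 - xC(x) - xyC(xy)]^3} + \frac{1}{[1 - xC(x) - xyC(xy)]^2}\right]\right\}. \tag{109}$$

Simplifying slightly using (B.10) we arrive at

$$S_n^{(s)} = [x^{n-1} y^s]\left\{\frac{1}{1 - 2xC(x)}\left[\frac{2}{[1 - xC(x) - xyC(xy)]^3} - \frac{1}{[1 - xC(x) - xyC(xy)]^2}\right]\right\}. \tag{110}$$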


As an example we consider the summed path length between nearest neighbouring leaves on a tree with $n$ vertices. We have

$$S_n^{(0)} = [x^n]\left\{\frac{x}{1 - 2xC(x)}\left[\frac{2}{[1 - xC(x)]^3} - \frac{1}{[1 - xC(x)]^2}\right]\right\} = [x^n]\left\{\frac{2C(x)^2 - 3C(x) + 1}{1 - 2xC(x)}\right\}.$$

This is the expression derived earlier in (37).

The general generating function can be written in terms of $C$ alone

$$S_n^{(s)} = [x^n y^s]\left\{\frac{C(x) - 1}{C(x)(2 - C(x))}\left[\frac{C(x)C(xy)}{C(x) + C(xy) - C(x)C(xy)}\right]^2\left[\frac{2C(x)C(xy)}{C(x) + C(xy) - C(x)C(xy)} - 1\right]\right\}. \tag{111}$$

Again using $C(x) = 1 + u(x)$ this becomes

$$S_n^{(s)} = [x^n y^s]\left\{\frac{u(x)}{1 - u(x)^2}\left[\frac{(1 + u(x))(1 + u(xy))}{1 - u(x)u(xy)}\right]^2\left[2\,\frac{(1 + u(x))(1 + u(xy))}{1 - u(x)u(xy)} - 1\right]\right\}. \tag{112}$$

If we say $z = xy$, we can write

$$S_n^{(s)} = [x^{n-s} z^s]\left\{\frac{u(x)}{1 - u(x)^2}\left[\frac{(1 + u(x))(1 + u(z))}{1 - u(x)u(z)}\right]^2\left[2\,\frac{(1 + u(x))(1 + u(z))}{1 - u(x)u(z)} - 1\right]\right\}. \tag{113}$$

We can now write the summed path length as a contour integral

$$S_n^{(s)} = \oint\frac{dx}{2\pi i}\frac{dz}{2\pi i}\,\frac{1}{x^{n-s+1}}\,\frac{1}{z^{s+1}}\,\frac{u(x)}{1 - u(x)^2}\left[\frac{(1 + u(x))(1 + u(z))}{1 - u(x)u(z)}\right]^2\left[2\,\frac{(1 + u(x))(1 + u(z))}{1 - u(x)u(z)} - 1\right]. \tag{114}$$

Given that $x = u(x)/(1 + u(x))^2$ we can write the integrand without direct reference to $x$ or $z$,

$$S_n^{(s)} = \oint\frac{dx}{2\pi i}\frac{dz}{2\pi i}\,\frac{[1 + u(x)]^{2(n-s+1)}\,[1 + u(z)]^{2(s+1)}}{u(x)^{n-s+1}\,u(z)^{s+1}}\,\frac{u(x)}{1 - u(x)^2}\left[\frac{(1 + u(x))(1 + u(z))}{1 - u(x)u(z)}\right]^2\left[2\,\frac{(1 + u(x))(1 + u(z))}{1 - u(x)u(z)} - 1\right]. \tag{115}$$


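We can make a change of variables $\xi = u(x)$, $\zeta = u(z)$ (which maintains the contour of integration) using $dx = \frac{1 - \xi}{(1 + \xi)^3}\,d\xi$, and likewise for $z$, to get to

$$\begin{aligned}
S_n^{(s)} &= \oint\frac{d\xi}{2\pi i}\frac{d\zeta}{2\pi i}\,\frac{\xi\,(1 - \zeta^2)(1 + \xi)^{2(n-s)}(1 + \zeta)^{2s}}{\xi^{n-s+1}\,\zeta^{s+1}\,(1 - \xi\zeta)^2}\left[2\,\frac{(1 + \xi)(1 + \zeta)}{1 - \xi\zeta} - 1\right] \\
&= [\xi^{n-s}\zeta^{s}]\left\{\frac{\xi\,(1 - \zeta^2)(1 + \xi)^{2(n-s)}(1 + \zeta)^{2s}}{(1 - \xi\zeta)^2}\left[2\,\frac{(1 + \xi)(1 + \zeta)}{1 - \xi\zeta} - 1\right]\right\},
\end{aligned} \tag{116}$$

which is an alternative form of the generating function. This form makes calculation of the nearest neighbour summed path length simpler. For example, for $s = 0$ ($r = 1$), setting $\zeta = 0$ gives $S_n^{(0)} = [\xi^{n-1}]\{(1 + \xi)^{2n}(1 + 2\xi)\} = \binom{2n}{n-1} + 2\binom{2n}{n-2}$, in agreement with (38).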


7 Average leaf-to-leaf path lengths in an ordered Catalan tree

The number $J_n^{(p)}$ of rooted paths of penetration $p$ as given in (85) can be simplified further, i.e.

$$J_n^{(p)} = [x^n y^p]\left\{\frac{1}{1 - xC(x) - xyC(xy)}\right\} = [x^n y^p]\left\{\frac{C(x) - yC(xy)}{1 - y}\right\} = [y^p]\left\{\frac{1 - y^{n+1}}{1 - y}\right\}C_n = [y^p]\left\{\sum_{k=0}^{n} y^k\right\}C_n = C_n\,\Theta(n - p), \tag{117}$$

where $\Theta(k) = 1$ if $k \ge 0$ and 0 otherwise. We have seen that the summed length of rooted paths of penetration $p$ is given by $D_n^{(p)}$ as derived in Eqs. (92) and (30), so that the average length of a rooted path is given by

$$\frac{D_n^{(p)}}{J_n^{(p)}} = (p+1)\frac{C_{n+1}}{C_n} - 1 - 2\sum_{k=0}^{p-1}(p-k)\frac{C_kC_{n-k}}{C_n} = \frac{D_{p+1,n}}{C_n} = \frac{(p+1)(p+2)\,C_{p+1}\,(2n - 2p + 1)(n - p + 1)\,C_{n-p}}{(n+1)(n+2)\,C_n} - 1. \tag{118}$$

The number of leaf-to-leaf paths with separation $s$ on a tree with $n$ vertices is

$$\begin{aligned}
M_n^{(s)} &= [x^{n-1} y^s]\left\{\frac{1}{[1 - xC(x) - xyC(xy)]^2}\,\frac{1}{1 - 2xC(x)}\right\} \\
&= [x^{n-1-s} z^s]\left\{\frac{1}{[1 - xC(x) - zC(z)]^2}\,\frac{d\,xC(x)}{dx}\right\} \\
&= [x^{n-1-s} z^s]\left\{\frac{\partial}{\partial x}\,\frac{1}{1 - xC(x) - zC(z)}\right\} \\
&= (n - s)\,[x^{n-1-s+1} z^s]\left\{\frac{1}{1 - xC(x) - zC(z)}\right\} \\
&= (n - s)\,[x^n y^s]\left\{\frac{1}{1 - xC(x) - xyC(xy)}\right\},
\end{aligned} \tag{119}$$

so we see that

$$M_n^{(s)} = (n - s)\,J_n^{(s)}. \tag{120}$$


The summed length $S_n^{(s)}$ of all leaf-to-leaf paths with separation $s$ on a tree with $n$ vertices as given in (110) can also be simplified via

$$\begin{aligned}
S_n^{(s)} &= [x^{n-1} y^s]\left\{\left[\frac{2}{[1 - xC(x) - xyC(xy)]^3} - \frac{1}{[1 - xC(x) - xyC(xy)]^2}\right]\frac{d\,xC(x)}{dx}\right\} \\
&= [x^{n-1-s} z^s]\left\{\left[\frac{2}{[1 - xC(x) - zC(z)]^3} - \frac{1}{[1 - xC(x) - zC(z)]^2}\right]\frac{d\,xC(x)}{dx}\right\} \\
&= [x^{n-1-s} z^s]\left\{\frac{\partial}{\partial x}\left[\frac{1}{[1 - xC(x) - zC(z)]^2} - \frac{1}{[1 - xC(x) - zC(z)]}\right]\right\} \\
&= (n - s)\,[x^n y^s]\left\{\frac{1}{[1 - xC(x) - xyC(xy)]^2} - \frac{1}{[1 - xC(x) - xyC(xy)]}\right\} \\
&= (n - s)\,D_n^{(s)}.
\end{aligned} \tag{121}$$

The conclusion we come to is that the summed length of leaf-to-leaf paths is related directly to the summed length of rooted paths by

$$S_n^{(s)} = (n - s)\,D_n^{(s)}, \tag{122}$$

and hence the average length of a leaf-to-leaf path with separation $s$ on a tree with $n$ vertices is

$$\frac{S_n^{(s)}}{M_n^{(s)}} = \frac{D_n^{(s)}}{J_n^{(s)}} = \frac{D_{s+1,n}}{C_n}. \tag{123}$$

We therefore see that the average leaf-to-leaf path length is identically the summed length $D_{s+1,n}$ of the rooted paths to the $(s+1)$th leaf of a tree with $n$ vertices, divided by $C_n$. With the explicit form of $D_{s+1,n}$ given in (30), we hence finally write

$$\frac{S_n^{(s)}}{M_n^{(s)}} = A_n(s+1) = A_n(r) = \frac{2r(r+1)(2n - 2r + 1)(2n - 2r + 3)}{(n+1)(n+2)}\,\frac{C_rC_{n-r}}{C_n} - 1. \tag{124}$$
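The relations (120) and (122) lend themselves to a brute-force check on small trees. In the sketch below, trees are encoded as nested pairs with `None` for a leaf, and each leaf is identified by its sequence of left/right turns from the root; this encoding and all helper names are our own assumptions, chosen only for illustration. The path length counts vertices, as in the text.

```python
from math import comb

def catalan(n):
    return comb(2 * n, n) // (n + 1)

def trees(n):
    """Yield every binary tree with n vertices as nested (left, right) pairs; None is a leaf."""
    if n == 0:
        yield None
        return
    for left in range(n):
        for lt in trees(left):
            for rt in trees(n - 1 - left):
                yield (lt, rt)

def leaf_routes(tree, route=()):
    """Yield, from left to right, the turns (0 = left, 1 = right) reaching each leaf."""
    if tree is None:
        yield route
    else:
        yield from leaf_routes(tree[0], route + (0,))
        yield from leaf_routes(tree[1], route + (1,))

def path_length(a, b):
    """Number of vertices on the path joining two distinct leaves with routes a and b."""
    shared = 0
    while a[shared] == b[shared]:
        shared += 1
    return len(a) + len(b) - 2 * shared - 1

def leaf_to_leaf(n, s):
    """Return (number of paths, summed length) for separation s on trees with n vertices."""
    count = total = 0
    for tree in trees(n):
        routes = list(leaf_routes(tree))
        for i in range(len(routes) - s - 1):
            count += 1
            total += path_length(routes[i], routes[i + s + 1])
    return count, total

def rooted_sum(n, p):
    """Summed rooted-path length, Eq. (92)."""
    return ((p + 1) * catalan(n + 1) - catalan(n)
            - 2 * sum((p - k) * catalan(k) * catalan(n - k) for k in range(p)))

for n in range(1, 7):
    for s in range(n):
        count, total = leaf_to_leaf(n, s)
        assert count == (n - s) * catalan(n)          # Eqs. (117) and (120)
        assert total == (n - s) * rooted_sum(n, s)    # Eq. (122)
```

The enumeration grows like the Catalan numbers, so it is only practical for small $n$; it is meant as a sanity check rather than a way of computing $S_n^{(s)}$.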


We then recover the results for $A_n(1)$, $A_n(2)$ and $A_n(3)$ of , (57) and (59), respectively, as well as obtain

$$A_n(4) = \frac{n[n(31n - 105) + 83] - 6}{4n^3 - 13n + 6}, \tag{125}$$

with $\lim_{n\to\infty} A_n(4) = 31/4$. In Fig. 10, we show the behaviour of $A_n(r)$ for various large $n$.
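The closed forms (124) and (125) can be checked in exact rational arithmetic against the sum in (92). This is a minimal sketch; the function names are ours, and $r = p + 1$ is assumed as in the text.

```python
from fractions import Fraction
from math import comb

def catalan(n):
    return comb(2 * n, n) // (n + 1)

def average_from_92(n, r):
    """D_{r,n}/C_n with the summed rooted-path length taken from Eq. (92), p = r - 1."""
    p = r - 1
    summed = ((p + 1) * catalan(n + 1) - catalan(n)
              - 2 * sum((p - k) * catalan(k) * catalan(n - k) for k in range(p)))
    return Fraction(summed, catalan(n))

def average_124(n, r):
    """Closed form of Eq. (124)."""
    prefactor = Fraction(2 * r * (r + 1) * (2 * n - 2 * r + 1) * (2 * n - 2 * r + 3),
                         (n + 1) * (n + 2))
    return prefactor * Fraction(catalan(r) * catalan(n - r), catalan(n)) - 1

def average_125(n):
    """Closed form of Eq. (125) for r = 4."""
    return Fraction(n * (n * (31 * n - 105) + 83) - 6, 4 * n**3 - 13 * n + 6)

for n in range(1, 30):
    for r in range(1, n + 1):
        assert average_from_92(n, r) == average_124(n, r)
    if n >= 4:
        assert average_124(n, 4) == average_125(n)
```

Using `Fraction` avoids any floating-point doubt about whether two expressions agree exactly.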


There is a simple geometric interpretation of the result of (122). For example, take a leaf-to-leaf path and deform the tree such that the left hand leaf is above the root, making the first vertex the new root. Now it is clear that any path with separation $r$ can be expressed as a leg depth of penetration $r$, as shown in Fig. 11. As the set of Catalan trees is complete in the sense that all possible binary trees are part of the set of Catalan trees, the newly deformed tree is also one of the set.

By observation, for example in Fig. 12, one can see that $(n - r + 1)$ paths map to each unique rooted tree depth. The degeneracy comes from the number of possible vertices to the right of the right hand leaf which can be the original root, highlighted in Figures 11 and 12 by the dashed edge.


8 Asymptotic properties of leaf-to-leaf path lengths 
in an ordered Catalan tree 


What we are particularly interested in is the behaviour of the average properties of paths on ordered Catalan trees in the limit as the number of vertices grows very large. We recall Stirling's approximation, that in the limit of large $n$, $n! \sim \sqrt{2\pi n}\,(n/e)^n$. Accordingly, we see that $C_n \sim \frac{1}{\sqrt{\pi}}\,\frac{1}{n^{3/2}}\,2^{2n}$ for large $n$. Therefore, we find from (124) that

$$A_\infty(r) = \lim_{n\to\infty} A_n(r) = 8r(r+1)\,4^{-r}\,C_r - 1.$$

In the continuum limit, $0 \ll r \ll n$, we have

$$A_\infty(r) \sim \sqrt{\frac{64r}{\pi}}, \tag{126}$$

as also shown in Fig. 10.

Figure 10: Average length of a leaf-to-leaf path $A_n(r)$ versus separation $r$ for small $n = 50$ (solid black line), 100 (solid red), 200 (solid blue) and large $n = 5{,}000$ (dashed black line), 10,000 (dashed red), 20,000 (dashed blue). The solid green line denotes $A_\infty(r)$, the dotted line corresponds to $r$, while the dashed-dotted line shows the corresponding result of Ref. [3] for complete binary graphs.
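The limits quoted in this section can be illustrated numerically. The sketch below evaluates (124) in floating point for a large but finite $n$ and compares it with $A_\infty(r) = 8r(r+1)4^{-r}C_r - 1$ and with the continuum form (126); the chosen values of $n$ and $r$ are arbitrary illustrations, not those of Fig. 10.

```python
from math import comb, pi, sqrt

def catalan(n):
    return comb(2 * n, n) // (n + 1)

def A(n, r):
    """Eq. (124), evaluated in floating point."""
    prefactor = (2 * r * (r + 1) * (2 * n - 2 * r + 1) * (2 * n - 2 * r + 3)
                 / ((n + 1) * (n + 2)))
    return prefactor * (catalan(r) * catalan(n - r) / catalan(n)) - 1

def A_inf(r):
    """Large-n limit of Eq. (124)."""
    return 8 * r * (r + 1) * catalan(r) / 4**r - 1

for r in (1, 4, 16, 64):
    print(r, A(2000, r), A_inf(r), sqrt(64 * r / pi))
```

For $r = 4$ the limit is exactly $31/4$, and for large $r$ the ratio of $A_\infty(r)$ to $\sqrt{64r/\pi}$ slowly approaches one.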


9 Conclusions 


As shown in Eq. (122), the summed length of leaf-to-leaf paths is directly related to the summed length of rooted paths in Catalan tree graphs. In Figures 11 and 12 we argue that there is a straightforward geometric interpretation of this result. We think that this is a rather nice insight to emerge from our calculations.

The final results for $A_n(r)$ and $A_\infty(r)$ are quite surprising to us when comparing, as done in Fig. 10, with the corresponding expressions $\mathcal{L}_n(r)$ and $\mathcal{L}_\infty(r)$ for complete binary trees [3] and the tree graphs emerging from the tree tensor network studied in Ref. [4]. We find that the $A$ path lengths take larger values than the $\mathcal{L}$'s. This can be understood as follows: in order for $n$ vertices to form into a complete binary graph, those vertices are "packed" very densely, hence giving rise to a small $\mathcal{L}_n(r)$. On the other hand, Catalan graphs with $n$ vertices can have a much more extended structure, leading to larger path lengths $A_n(r)$ for the same $r$. Comparing to the tree tensor networks, this result shows that our naive expectation, expressed in the introduction, is wrong: the path lengths are quite different for larger $r$ values and hence the implied correlation functions have a very different long distance behaviour. Just as for complete binary graphs, it remains to be seen what Hamiltonians, if any, correspond to these correlations.

Figure 11: Graphical representation of the equivalence between path length on a Catalan graph with $n = 3$ nodes and a rooted path connecting the same nodes created by deforming the tree (panels labelled $r = 1$ and $r = 2$). All possible paths on a particular tree are shown; paths with separation $r$ are mapped to trees with penetration $r$. Symbols and lines are as in Fig. 1.

Figure 12: Example of the degeneracy of rooted paths, showing the $(n - r + 1) = 3$ cases where $n = 3$ and $r = 1$ that map to the same rooted tree. Symbols and lines are as in Fig. 11.


A More relations using Catalan numbers 

A.1 Extensions of Segner's relation

The following two relations will be useful. First, let us show that

$$(n+1)\,C_{n+1} = \sum_{q=0}^{n}(2q+1)\,C_qC_{n-q}. \tag{A.1}$$

Using Segner's relation (1), the right hand side of (A.4) can be written as a square of terms that takes into account the degeneracy given by $(2q+1)$. This can be seen by first considering degeneracy $(q+1)$,

$$\begin{aligned}
\sum_{q=0}^{n}(q+1)\,C_qC_{n-q} = {} & C_0C_n + C_1C_{n-1} + C_2C_{n-2} + \dots + C_nC_0 \\
& + 0 + C_1C_{n-1} + C_2C_{n-2} + \dots + C_nC_0 \\
& + 0 + 0 + C_2C_{n-2} + \dots + C_nC_0 + \dots
\end{aligned} \tag{A.2}$$

Multiplied by two, this completes the square but double counts the terms along the diagonal, which need to be removed. The complete square is then $(n+1)$ copies of Segner's relation,

$$(n+1)\,C_{n+1} = 2\sum_{q=0}^{n}(q+1)\,C_qC_{n-q} - \sum_{q=0}^{n}C_qC_{n-q} \tag{A.3}$$

$$= \sum_{q=0}^{n}(2q+1)\,C_qC_{n-q}. \tag{A.4}$$

The second, very similar relation is

$$(2n+1)\,C_n = \sum_{q=0}^{n}(q+1)\,C_qC_{n-q}. \tag{A.5}$$

Starting with Eq. (A.3) and noticing that the final term is just Segner's relation for $C_{n+1}$, we can write

$$(n+1)\,C_{n+1} = 2\sum_{q=0}^{n}(q+1)\,C_qC_{n-q} - C_{n+1}, \qquad (n+2)\,C_{n+1} = 2\sum_{q=0}^{n}(q+1)\,C_qC_{n-q}. \tag{A.6}$$

Using the recursion relation (2) gives

$$\frac{(n+2)}{2}\,\frac{2(2n+1)}{n+2}\,C_n = \sum_{q=0}^{n}(q+1)\,C_qC_{n-q}, \qquad (2n+1)\,C_n = \sum_{q=0}^{n}(q+1)\,C_qC_{n-q}. \tag{A.7}$$
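The relations (A.1) and (A.5) are easy to confirm numerically alongside Segner's relation itself. The helper `conv` below is an illustrative name for the weighted Catalan convolution, not notation from the text.

```python
from math import comb

def catalan(n):
    return comb(2 * n, n) // (n + 1)

def conv(n, weight):
    """Weighted convolution sum_q weight(q) C_q C_{n-q}."""
    return sum(weight(q) * catalan(q) * catalan(n - q) for q in range(n + 1))

for n in range(0, 40):
    assert conv(n, lambda q: 1) == catalan(n + 1)                     # Segner's relation
    assert conv(n, lambda q: 2 * q + 1) == (n + 1) * catalan(n + 1)   # Eq. (A.1)
    assert conv(n, lambda q: q + 1) == (2 * n + 1) * catalan(n)       # Eq. (A.5)
```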


A.2 Incomplete Segner relations 


We investigate the generalisation of Segner's relation (1) to an incomplete sum and so consider

$$s_{pn} = \sum_{k=0}^{p}C_kC_{n-k}. \tag{A.8}$$

We claim that

$$s_{pn} = \frac{1}{2}C_{n+1} + \frac{1}{2}\,\frac{(2p + 1 - n)(p + 2)(n - p + 1)}{(n+1)(n+2)}\,C_{p+1}C_{n-p}. \tag{A.9}$$

This may be proven by induction, using only the recurrence relation for the Catalan numbers

$$C_{n+1} = 2\,\frac{2n+1}{n+2}\,C_n. \tag{A.10}$$

First we confirm its validity for $p = 0$, i.e.

$$s_{0n} = \frac{1}{2}C_{n+1} + \frac{1}{2}\,\frac{(1 - n)(2)(n + 1)}{(n+1)(n+2)}\,C_1C_n = \frac{2n+1}{n+2}\,C_n + \frac{1 - n}{n+2}\,C_n = C_n = \sum_{k=0}^{0}C_kC_{n-k}. \tag{A.11}$$

Assuming the expression to be valid for $p$, we proceed to show that this implies the validity for $p + 1$, i.e.

$$\begin{aligned}
s_{p+1,n} &= s_{pn} + C_{p+1}C_{n-(p+1)} \\
&= \frac{1}{2}C_{n+1} + \left[\frac{1}{2}\,\frac{(2p + 1 - n)(p + 2)(n - p + 1)}{(n+1)(n+2)}\,C_{n-p} + C_{n-p-1}\right]C_{p+1} \\
&= \frac{1}{2}C_{n+1} + \left[\frac{1}{2}\,\frac{(2p + 1 - n)(p + 2)(n - p + 1)}{(n+1)(n+2)}\,\frac{2(2(n-p) - 1)}{n - p + 1} + 1\right]C_{n-p-1}C_{p+1} \\
&= \frac{1}{2}C_{n+1} + \frac{(2p + 1 - n)(p + 2)(2(n-p) - 1) + (n+1)(n+2)}{(n+1)(n+2)}\,C_{n-p-1}C_{p+1} \\
&= \frac{1}{2}C_{n+1} + \frac{(2p + 3 - n)(n - p)(2p + 3)}{(n+1)(n+2)}\,C_{n-p-1}C_{p+1} \\
&= \frac{1}{2}C_{n+1} + \frac{1}{2}\,\frac{(2p + 3 - n)(n - p)(p + 3)}{(n+1)(n+2)}\,C_{n-p-1}C_{p+2} \\
&= \frac{1}{2}C_{n+1} + \frac{1}{2}\,\frac{[2(p+1) + 1 - n]\,[(p+1) + 2]\,[n - (p+1) + 1]}{(n+1)(n+2)}\,C_{(p+1)+1}C_{n-(p+1)},
\end{aligned} \tag{A.12}$$

and so the formula is proven by induction. As a final check we can look at the complete sum $s_{nn}$ which should reproduce Segner's relation

$$s_{nn} = \frac{1}{2}C_{n+1} + \frac{1}{2}\,\frac{(2n + 1 - n)(n + 2)(n - n + 1)}{(n+1)(n+2)}\,C_{n+1}C_{n-n} = \frac{1}{2}C_{n+1} + \frac{1}{2}C_{n+1}C_0 = C_{n+1}. \tag{A.13}$$

Another relation that is useful is

$$\sum_{k=0}^{p-1}(p - k)\,C_{n-k}C_k = \frac{1}{(n+1)(n+2)}\Big\{(2n+1)(p+1)(n+1)\,C_n - (2p+1)(p+1)\,C_p\,[2(n-p) + 1](n - p + 1)\,C_{n-p}\Big\}. \tag{A.14}$$

Just as for (A.9), the relation can be proven by induction, after some algebra.
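Both the incomplete Segner relation (A.9) and the weighted sum (A.14) can be tested in exact arithmetic. The sketch below assumes $0 \le p \le n$; the function names are ours.

```python
from fractions import Fraction
from math import comb

def catalan(n):
    return comb(2 * n, n) // (n + 1)

def s_direct(p, n):
    return sum(catalan(k) * catalan(n - k) for k in range(p + 1))

def s_closed(p, n):
    """Right hand side of Eq. (A.9)."""
    return (Fraction(catalan(n + 1), 2)
            + Fraction((2 * p + 1 - n) * (p + 2) * (n - p + 1), 2 * (n + 1) * (n + 2))
            * catalan(p + 1) * catalan(n - p))

def weighted_direct(p, n):
    return sum((p - k) * catalan(n - k) * catalan(k) for k in range(p))

def weighted_closed(p, n):
    """Right hand side of Eq. (A.14)."""
    return Fraction((2 * n + 1) * (p + 1) * (n + 1) * catalan(n)
                    - (2 * p + 1) * (p + 1) * catalan(p)
                    * (2 * (n - p) + 1) * (n - p + 1) * catalan(n - p),
                    (n + 1) * (n + 2))

for n in range(0, 25):
    for p in range(0, n + 1):
        assert s_direct(p, n) == s_closed(p, n)
        assert weighted_direct(p, n) == weighted_closed(p, n)
```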


B More on generating functions 

From elementary calculus we know that the series coefficients of $a(x)$ can be expressed as [19]

$$[x^n]\{a(x)\} = \frac{1}{n!}\,\frac{d^n a(x)}{dx^n}\bigg|_{x=0} = \oint\frac{dx}{2\pi i}\,\frac{1}{x}\,\frac{a(x)}{x^n}. \tag{B.1}$$

Below, we list some particular generating functions and their associated series coefficients. A useful identity is the Lagrange inversion formula: suppose $u$ is a function which can be expressed as $u(x) = x\phi(u)$ where the function $\phi$ satisfies $\phi(0) = 1$; then for a given function $f$, we have

$$[x^n]\{f(u(x))\} = \frac{1}{n}\,[u^{n-1}]\{f'(u)\,\phi(u)^n\}. \tag{B.2}$$

Generating functions can also be defined for multi-indexed series $a_{ijk\ldots}$ [19]. We will extend our notation so that $[x_1^{n_1}x_2^{n_2}\cdots x_r^{n_r}]\{f(x_1,x_2,\ldots,x_r)\}$ is the coefficient of $x_1^{n_1}x_2^{n_2}\ldots x_r^{n_r}$ in the Taylor series of $f(x_1,x_2,\ldots,x_r)$. A convenient identity for dealing with generating functions of many variables is

$$\sum_{r=0}^{n}[x^{n-r}y^r]\{a(x,y)\} = [x^n]\{a(x,x)\}. \tag{B.3}$$

As an example of the use of the notation for generating functions, following Ref. [19] we list some particular generating functions and their associated series coefficients,

$$[x^n]\{x^p\} = \delta_{n,p}, \tag{B.4}$$

$$[x^n]\{e^x\} = 1/n!, \tag{B.5}$$

= (B.6) 

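$$[x^n]\{(1 + x)^m\} = \binom{m}{n}. \tag{B.7}$$

If $[x^n]\{a(x)\} = a_n$ and $[x^n]\{b(x)\} = b_n$, it is simple to determine that

$$[x^n]\{a(x)b(x)\} = \sum_{k=0}^{n}a_kb_{n-k}. \tag{B.8}$$

Raising and lowering of indices in series coefficients can be achieved using the following trivially demonstrated formulas

$$[x^{n-1}]\{a(x)\} = [x^n]\{x\,a(x)\}, \tag{B.9}$$

$$[x^{n+1}]\{a(x)\} = [x^n]\left\{\frac{a(x)}{x}\right\}. \tag{B.10}$$

Differentiating Eq. (3) we notice that

$$[x^{n-1}]\left\{\frac{da(x)}{dx}\right\} = n\,[x^n]\{a(x)\}, \tag{B.11}$$

combining this with Eq. (B.9) we get the useful relation

$$n\,[x^n]\{a(x)\} = [x^n]\left\{x\frac{d}{dx}a(x)\right\}. \tag{B.12}$$

From the series definition it should be clear that constant multiples of the argument can just be brought out front as follows

$$[x^n]\{a(cx)\} = c^n\,[x^n]\{a(x)\}. \tag{B.13}$$

The proof of the Lagrange inversion formula (B.2) is very simple, applying the identities stated above [19]

$$\begin{aligned}
[x^n]\{f(u(x))\} &= \frac{1}{n}\oint\frac{dx}{2\pi i}\,\frac{1}{x}\,\frac{1}{x^{n-1}}\,\frac{df}{du}\frac{du}{dx} \\
&= \frac{1}{n}\oint\frac{dx}{2\pi i}\left(\frac{\phi(u)}{u}\right)^{n}\frac{df}{du}\frac{du}{dx} \\
&= \frac{1}{n}\oint\frac{dx}{2\pi i\,u}\,\frac{du}{dx}\,\frac{\phi(u)^n}{u^{n-1}}\,f'(u) \\
&= \frac{1}{n}\oint\frac{du}{2\pi i\,u}\,\frac{1}{u^{n-1}}\,\phi(u)^n f'(u) \\
&= \frac{1}{n}\,[u^{n-1}]\{f'(u)\,\phi(u)^n\}.
\end{aligned}$$

Similarly, (B.3) can be shown as follows,

$$\begin{aligned}
\sum_{r=0}^{n}[x^{n-r}y^r]\{a(x,y)\} &= \sum_{r=0}^{n}\frac{1}{(n-r)!}\,\frac{\partial^{n-r}}{\partial x^{n-r}}\,\frac{1}{r!}\,\frac{\partial^{r}}{\partial y^{r}}\,a(x,y)\bigg|_{x,y=0} \\
&= \frac{1}{n!}\sum_{r=0}^{n}\binom{n}{r}\frac{\partial^{n-r}}{\partial x^{n-r}}\frac{\partial^{r}}{\partial y^{r}}\,a(x,y)\bigg|_{x,y=0} \\
&= \frac{1}{n!}\left(\frac{\partial}{\partial x} + \frac{\partial}{\partial y}\right)^{n}a(x,y)\bigg|_{x,y=0} \\
&= \frac{1}{n!}\,\frac{\partial^n}{\partial t^n}\,a(t + s, t - s)\bigg|_{s,t=0} \qquad \left(t = \tfrac{1}{2}(x + y),\ s = \tfrac{1}{2}(x - y)\right) \\
&= \frac{1}{n!}\,\frac{d^n}{dt^n}\,a(t,t)\bigg|_{t=0} \\
&= [t^n]\{a(t,t)\}.
\end{aligned}$$

Of course $t$ here is a dummy index so we can call it $x$ or whatever we like.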
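As an illustration of the Lagrange inversion formula (B.2), consider $u(x) = C(x) - 1$, which satisfies $u = x(1 + u)^2$, so that $\phi(u) = (1 + u)^2$ with $\phi(0) = 1$. The sketch below takes $f(u) = u^m$ (our choice, for illustration) and checks the resulting coefficients against Catalan numbers.

```python
from math import comb

def catalan(n):
    return comb(2 * n, n) // (n + 1)

def coefficient_of_power(n, m):
    """[x^n]{u(x)^m} from (B.2) with f(u) = u^m and phi(u) = (1 + u)^2:
    (1/n) [u^(n-1)] { m u^(m-1) (1 + u)^(2n) } = (m/n) * binom(2n, n - m)."""
    if n < m:
        return 0
    return m * comb(2 * n, n - m) // n

for n in range(1, 30):
    # m = 1: u = C - 1, so the coefficient is the Catalan number C_n.
    assert coefficient_of_power(n, 1) == catalan(n)
    # m = 2: u^2 = C^2 - 2C + 1 and [x^n]{C^2} = C_{n+1} by Segner's relation.
    assert coefficient_of_power(n, 2) == catalan(n + 1) - 2 * catalan(n)
```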


C Details of some derivations 


C.1 The derivation for $D_{m,n}$

We drop the index in Eq. (23) on the first sum to zero and introduce a step function ($H_p = 1$ if $p \ge 0$, 0 if $p < 0$) so that the upper limit can go to infinity:

$$D_{m,n+m} = C_{n+m} + \sum_{k=0}^{\infty}D_{m,k+m-1}\,C_{n-k}\,H_{n-k} + \sum_{k=0}^{m-2}D_{m-k-1,n+m-k-1}\,C_k. \tag{C.1}$$

Multiply both sides by $x^{m+n}$ and sum from $n = 0$ to $\infty$ to match the generating function. Note that we use $n + m$ rather than $n + m - 1$ as in the generating function so that the index on the Catalan numbers is never negative.

$$\sum_{n=0}^{\infty}D_{m,n+m}\,x^{m+n} = \sum_{n=0}^{\infty}C_{n+m}\,x^{n+m} + \sum_{n=0}^{\infty}\sum_{k=0}^{\infty}D_{m,k+m-1}\,C_{n-k}\,H_{n-k}\,x^{n+m} + \sum_{n=0}^{\infty}\sum_{k=0}^{m-2}D_{m-k-1,n+m-k-1}\,C_k\,x^{n+m}. \tag{C.2}$$

The left hand side can clearly be expressed in terms of $\mathcal{D}_m(x)$:

$$\sum_{n=0}^{\infty}D_{m,n+m}\,x^{n+m} = \sum_{n=0}^{\infty}D_{m,n+m-1}\,x^{n+m-1} - D_{m,m-1}\,x^{m-1} = \mathcal{D}_m(x) - D_{m,m-1}\,x^{m-1}. \tag{C.3}$$

The first term on the right can be written in terms of $C(x)$ as

$$\sum_{n=0}^{\infty}C_{n+m}\,x^{n+m} = \sum_{n=0}^{\infty}C_n x^n - \sum_{n=0}^{m-1}C_n x^n = C(x) - \sum_{n=0}^{m-1}C_n x^n. \tag{C.4}$$

The second term can also be written in terms of generating functions:

$$\begin{aligned}
\sum_{n=0}^{\infty}\sum_{k=0}^{\infty}D_{m,k+m-1}\,C_{n-k}\,H_{n-k}\,x^{n+m} &= \sum_{k=0}^{\infty}D_{m,k+m-1}\,x^{m+k}\sum_{n=0}^{\infty}C_{n-k}\,H_{n-k}\,x^{n-k} \\
&= x\sum_{k=0}^{\infty}D_{m,k+m-1}\,x^{k+m-1}\sum_{l=0}^{\infty}C_l\,x^l \\
&= x\,\mathcal{D}_m(x)\,C(x),
\end{aligned} \tag{C.5}$$

where we have used $l = n - k$ on the second line. The final term is similarly

$$\begin{aligned}
\sum_{n=0}^{\infty}\sum_{k=0}^{m-2}D_{m-k-1,n+m-k-1}\,C_k\,x^{n+m} &= \sum_{k=0}^{m-2}C_k\,x^{k+1}\sum_{n=0}^{\infty}D_{m-k-1,n+m-k-1}\,x^{n+m-k-1} \\
&= \sum_{k=0}^{m-2}C_k\,x^{k+1}\left(\sum_{n=0}^{\infty}D_{m-k-1,n+m-k-2}\,x^{n+m-k-2} - D_{m-k-1,m-k-2}\,x^{m-k-2}\right) \\
&= \sum_{k=0}^{m-2}C_k\,x^{k+1}\left(\mathcal{D}_{m-k-1}(x) - D_{m-k-1,m-k-2}\,x^{m-k-2}\right).
\end{aligned} \tag{C.6}$$

When put together this gives

$$\mathcal{D}_m(x) - D_{m,m-1}\,x^{m-1} = C(x) - \sum_{n=0}^{m-1}C_n x^n + x\,\mathcal{D}_m(x)\,C(x) + \sum_{k=0}^{m-2}C_k\,x^{k+1}\left(\mathcal{D}_{m-k-1}(x) - D_{m-k-1,m-k-2}\,x^{m-k-2}\right). \tag{C.7}$$

Collecting the $\mathcal{D}_m(x)$ on the left then using Eq. (4) and $D_{m,n} = D_{n+2-m,n}$ results in Eq.


We now continue from Eq. (28). We start by writing

$$x^m C(x) = \sum_{n=0}^{\infty}C_n\,x^{n+m} = \sum_{k=m}^{\infty}C_{k-m}\,x^k. \tag{C.8}$$

Inserting (28) gives

$$f_m(x)C(x) + g_m(x) = C^2(x) + C(x)\sum_{n=0}^{m-2}C_n x^n\left[x f_{m-n-1}(x)C(x) + x g_{m-n-1}(x) - 1\right]. \tag{C.9}$$

In the spirit of the ansatz we use (6) to remove the $C^2(x)$ terms and make the expression dependent on just $C(x)$:

$$f_m(x)C(x) + g_m(x) = \frac{C(x)}{x} - \frac{1}{x} + C(x)\sum_{n=0}^{m-2}C_n x^n\left[x g_{m-n-1}(x) - 1\right] + (C(x) - 1)\sum_{n=0}^{m-2}C_n x^n f_{m-n-1}(x). \tag{C.10}$$

Collecting coefficients of $C(x)$ gives

$$f_m(x) = \frac{1}{x} + \sum_{n=0}^{m-2}C_n x^n\left[f_{m-n-1}(x) + x g_{m-n-1}(x) - 1\right], \tag{C.11}$$

leaving

$$g_m(x) = -\frac{1}{x} - \sum_{n=0}^{m-2}C_n x^n f_{m-n-1}(x). \tag{C.12}$$

We can now write out $f_m(x)$ and $g_m(x)$ for the first few $m$. Recall that the above expressions are only valid for $m \ge 2$, we therefore obtain $f_1(x)$ and $g_1(x)$ from to seed the other terms. The first seven $f_m(x)$ are

$$f_1(x) = \frac{1}{x} - 1 \tag{C.13a}$$

$$f_2(x) = \frac{2}{x} - 3 \tag{C.13b}$$

$$f_3(x) = \frac{3}{x} - 5 - 2x \tag{C.13c}$$

$$f_4(x) = \frac{4}{x} - 7 - 4x - 4x^2 \tag{C.13d}$$

$$f_5(x) = \frac{5}{x} - 9 - 6x - 8x^2 - 10x^3 \tag{C.13e}$$

$$f_6(x) = \frac{6}{x} - 11 - 8x - 12x^2 - 20x^3 - 28x^4 \tag{C.13f}$$

$$f_7(x) = \frac{7}{x} - 13 - 10x - 16x^2 - 30x^3 - 56x^4 - 84x^5 \tag{C.13g}$$

and for $g_m(x)$

$$g_1(x) = -\frac{1}{x} \tag{C.14a}$$

$$g_2(x) = -\frac{2}{x} + 1 \tag{C.14b}$$

$$g_3(x) = -\frac{3}{x} + 2 + x \tag{C.14c}$$

$$g_4(x) = -\frac{4}{x} + 3 + 3x + 2x^2 \tag{C.14d}$$

$$g_5(x) = -\frac{5}{x} + 4 + 5x + 7x^2 + 5x^3 \tag{C.14e}$$

$$g_6(x) = -\frac{6}{x} + 5 + 7x + 12x^2 + 19x^3 + 14x^4 \tag{C.14f}$$

$$g_7(x) = -\frac{7}{x} + 6 + 9x + 17x^2 + 33x^3 + 56x^4 + 42x^5. \tag{C.14g}$$
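The lists above can be regenerated mechanically from the recursions (C.11) and (C.12), seeded with $f_1$ and $g_1$. The sketch below stores each Laurent polynomial as a dictionary mapping powers of $x$ to coefficients; this representation and the helper names `combine` and `build` are our own choices. It also compares the output with the pattern $f_m = m/x - 1 - 2\sum_k (m-k-1)C_kx^k$ and $g_m = -m/x + \sum_k[(m-k-2)C_{k+1} + C_k]x^k$, with $k$ running from 0 to $m-2$.

```python
from math import comb

def catalan(n):
    return comb(2 * n, n) // (n + 1)

def combine(a, b, scale=1, shift=0):
    """Return a + scale * x**shift * b for Laurent polynomials stored as {power: coefficient}."""
    out = dict(a)
    for power, coeff in b.items():
        out[power + shift] = out.get(power + shift, 0) + scale * coeff
    return {power: coeff for power, coeff in out.items() if coeff != 0}

def build(max_m):
    f = {1: {-1: 1, 0: -1}}      # f_1 = 1/x - 1, Eq. (C.13a)
    g = {1: {-1: -1}}            # g_1 = -1/x,   Eq. (C.14a)
    for m in range(2, max_m + 1):
        fm, gm = {-1: 1}, {-1: -1}
        for n in range(m - 1):
            j = m - n - 1
            bracket = combine(combine(f[j], g[j], shift=1), {0: 1}, scale=-1)  # f_j + x g_j - 1
            fm = combine(fm, bracket, scale=catalan(n), shift=n)               # Eq. (C.11)
            gm = combine(gm, f[j], scale=-catalan(n), shift=n)                 # Eq. (C.12)
        f[m], g[m] = fm, gm
    return f, g

def f_pattern(m):
    out = {-1: m, 0: -1}
    for k in range(m - 1):
        out[k] = out.get(k, 0) - 2 * (m - k - 1) * catalan(k)
    return out

def g_pattern(m):
    out = {-1: -m}
    for k in range(m - 1):
        out[k] = (m - k - 2) * catalan(k + 1) + catalan(k)
    return out

f, g = build(7)
for m in range(1, 8):
    assert f[m] == f_pattern(m)
    assert g[m] == g_pattern(m)
```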

It is possible to spot patterns in these equations and write a general formula for each 

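$$\begin{aligned}
f_m(x) ={}& \frac{m}{x} - 1 - 2H_{m-2}C_0(m-1) - 2H_{m-3}C_1(m-2)x - 2H_{m-4}C_2(m-3)x^2 \\
& - 2H_{m-5}C_3(m-4)x^3 - \ldots
\end{aligned} \tag{C.15}$$

$$\phantom{f_m(x)} = \frac{m}{x} - 1 - 2\sum_{k=0}^{m-2}(m-k-1)\,C_k x^k \tag{C.16}$$

$$\begin{aligned}
g_m(x) ={}& -\frac{m}{x} + H_{m-2}\left(C_1(m-2) + C_0\right) + H_{m-3}\left(C_2(m-3) + C_1\right)x \\
& + H_{m-4}\left(C_3(m-4) + C_2\right)x^2 + H_{m-5}\left(C_4(m-5) + C_3\right)x^3 + \ldots
\end{aligned} \tag{C.17}$$

$$\phantom{g_m(x)} = -\frac{m}{x} + \sum_{k=0}^{m-2}\left[(m-k-2)\,C_{k+1} + C_k\right]x^k. \tag{C.18}$$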
In the first line of each, a step function $H_p$ has been introduced to cut the expression
so that it gives only the terms allowed for that $m$. This is neatly included within the
summations in the final forms. Finally, multiplying (C.16) by $C(x)$ and using (C.8) yields

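$$f_m(x)C(x) = \left[\frac{m}{x} - 1 - 2\sum_{k=0}^{m-2}(m-k-1)\,C_k x^k\right]\sum_{n=0}^{\infty} C_n x^n \tag{C.19}$$

$$\begin{aligned}
\phantom{f_m(x)C(x)} ={}& \sum_{n=m-1}^{\infty}\left[m C_{n+1} - C_n - 2\sum_{k=0}^{m-2}(m-k-1)\,C_k C_{n-k}\right]x^n \\
& + m\sum_{n=-1}^{m-2} C_{n+1}x^n - \sum_{n=0}^{m-2} C_n x^n - 2\sum_{k=0}^{m-2}\sum_{n=k}^{m-2}(m-k-1)\,C_k C_{n-k}\,x^n.
\end{aligned} \tag{C.20}$$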


This implies Eq. (29). We now just have to show that the rest of the terms are taken care
of by $g_m(x)$. Using the double sum identity

$$\sum_{a=0}^{c}\sum_{b=a}^{c} = \sum_{b=0}^{c}\sum_{a=0}^{b}, \tag{C.21}$$

where $a, b, c \in \mathbb{Z}^+$, the final term in Eq. (C.20) can be simplified

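$$\begin{aligned}
\sum_{k=0}^{m-2}\sum_{n=k}^{m-2}(m-k-1)\,C_k C_{n-k}\,x^n &= \sum_{n=0}^{m-2}\sum_{k=0}^{n}(m-k-1)\,C_k C_{n-k}\,x^n \\
&= \sum_{n=0}^{m-2}\sum_{k=0}^{n}\left[m C_k C_{n-k} - (k+1)\,C_k C_{n-k}\right]x^n \\
&= \sum_{n=0}^{m-2}\left[m C_{n+1} - (2n+1)\,C_n\right]x^n,
\end{aligned} \tag{C.22}$$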


and in the final line we have used Segner’s relation on the first term and (A.4) on the 
second. Thus (C.20) can be written as 

$$f_m(x)C(x) = \sum_{n=m-1}^{\infty} D_{m,n}x^n + \frac{m}{x} - \sum_{n=0}^{m-2}\left[C_n - m C_{n+1} + 2m C_{n+1} - 2(2n+1)\,C_n\right]x^n. \tag{C.23}$$

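Using the recursion relation for Catalan numbers (2) then gives

$$\begin{aligned}
f_m(x)C(x) &= \sum_{n=m-1}^{\infty} D_{m,n}x^n + \frac{m}{x} - \sum_{n=0}^{m-2}\left[C_n + m C_{n+1} - (n+2)\,C_{n+1}\right]x^n \\
&= \sum_{n=m-1}^{\infty} D_{m,n}x^n + \frac{m}{x} - \sum_{n=0}^{m-2}\left[(m-n-2)\,C_{n+1} + C_n\right]x^n \\
&= \mathcal{D}_m(x) - g_m(x),
\end{aligned} \tag{C.24}$$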


as desired. 
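The cancellation shown in (C.22) to (C.24) can also be checked coefficient by coefficient. The sketch below is an illustration under stated assumptions, not the authors' procedure: it reads the coefficient of $x^n$ in $f_m(x)C(x)$ off (C.20) and the coefficient of $x^n$ in $g_m(x)$ off (C.18), confirms that they cancel for $0 \leq n \leq m-2$ (the $1/x$ terms $m/x$ and $-m/x$ cancel trivially), and checks the symmetry $D_{m,n} = D_{n+2-m,n}$ for $1 \leq m \leq n+1$. The function names are hypothetical.

```python
from math import comb


def catalan(n):
    """n-th Catalan number C_n."""
    return comb(2 * n, n) // (n + 1)


def fC_coeff(m, n):
    """Coefficient of x^n (n >= 0) in f_m(x) C(x), with f_m taken from (C.16)."""
    conv = sum((m - k - 1) * catalan(k) * catalan(n - k)
               for k in range(min(m - 2, n) + 1))
    return m * catalan(n + 1) - catalan(n) - 2 * conv


def g_coeff(m, n):
    """Coefficient of x^n (0 <= n <= m-2) in g_m(x), taken from (C.18)."""
    return (m - n - 2) * catalan(n + 1) + catalan(n)


def D(m, n):
    """D_{m,n} as it appears in the first sum of (C.20); meaningful for n >= m-1."""
    return fC_coeff(m, n)


# Low-order terms of f_m(x) C(x) + g_m(x) must vanish.
for m in range(2, 10):
    for n in range(m - 1):
        assert fC_coeff(m, n) + g_coeff(m, n) == 0

# Symmetry D_{m,n} = D_{n+2-m,n}.
for n in range(9):
    for m in range(1, n + 2):
        assert D(m, n) == D(n + 2 - m, n)
```

For $m = 1$ and $m = 2$ the function `D` reduces to $C_{n+1} - C_n$ and $2C_{n+1} - 3C_n$ respectively, the two special cases used in Section C.2.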
C.2 Simplification of $S_n(1)$

It is possible to simplify equation (31) in the following way:


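$$\begin{aligned}
S_n(1) &= \sum_{\alpha=0}^{n-1}\left(C_\alpha D_{n-\alpha,n-\alpha-1} + C_{n-\alpha-1}D_{\alpha+1,\alpha} + C_{n-\alpha-1}C_\alpha + S_{n-\alpha-1}C_\alpha + S_\alpha C_{n-\alpha-1}\right) \\
&= \sum_{\alpha=0}^{n-1}\left(D_{1,n-\alpha-1}C_\alpha + D_{1,\alpha}C_{n-\alpha-1} + S_{n-\alpha-1}C_\alpha + S_\alpha C_{n-\alpha-1}\right) + C_n \\
&= 2\sum_{\alpha=0}^{n-1} D_{1,\alpha}C_{n-\alpha-1} + 2\sum_{\alpha=0}^{n-1} S_\alpha C_{n-\alpha-1} + C_n \\
&= 2\sum_{\alpha=0}^{n-1}\left(C_{\alpha+1} - C_\alpha\right)C_{n-\alpha-1} + 2\sum_{\alpha=0}^{n-1} S_\alpha C_{n-\alpha-1} + C_n \\
&= 2\sum_{\alpha=0}^{n-1} C_{\alpha+1}C_{n-(\alpha+1)} - 2\sum_{\alpha=0}^{n-1} C_\alpha C_{n-\alpha-1} + 2\sum_{\alpha=0}^{n-1} S_\alpha C_{n-\alpha-1} + C_n.
\end{aligned} \tag{C.25}$$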


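We now set $\beta = \alpha + 1$ and add and subtract $2C_0C_{n-0}$ to change the summation limits. This
yields

$$\begin{aligned}
S_n(1) &= \left[2\sum_{\beta=0}^{n} C_\beta C_{n-\beta} - 2C_n\right] - 2C_n + 2\sum_{\alpha=0}^{n-1} S_\alpha C_{n-\alpha-1} + C_n \\
&= 2C_{n+1} - 3C_n + 2\sum_{\alpha=0}^{n-1} S_\alpha C_{n-\alpha-1} \\
&= D_{2,n} + 2\sum_{\alpha=0}^{n-1} S_\alpha C_{n-\alpha-1},
\end{aligned} \tag{C.26}$$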


as in (34). 
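The recursion in (C.26) can be compared against a direct enumeration for small $n$. The sketch below is an independent check, not the authors' method, and rests on two assumptions that this appendix does not state explicitly: that $S_0(1) = 0$, and that the path length between two leaves counts the vertices strictly between them (the number of edges minus one), so that the two leaves of the single $n = 1$ tree are at distance 1, in agreement with $S_1(1) = D_{2,1} = 1$. All function names are hypothetical.

```python
from math import comb


def catalan(n):
    """n-th Catalan number C_n."""
    return comb(2 * n, n) // (n + 1)


def leaf_paths(n):
    """Yield, for every full binary tree with n internal nodes, the ordered list
    of root-to-leaf paths written as strings of '0' (left) and '1' (right)."""
    if n == 0:
        yield [""]
        return
    for k in range(n):  # k internal nodes in the left subtree
        for left in leaf_paths(k):
            for right in leaf_paths(n - 1 - k):
                yield ["0" + p for p in left] + ["1" + p for p in right]


def path_length(p, q):
    """Vertices strictly between two leaves (assumed convention: edges minus one)."""
    common = 0
    for a, b in zip(p, q):
        if a != b:
            break
        common += 1
    return len(p) + len(q) - 2 * common - 1


def S_brute(n, r):
    """Sum of path lengths over all trees and all leaf pairs at separation r."""
    return sum(path_length(leaves[i], leaves[i + r])
               for leaves in leaf_paths(n)
               for i in range(len(leaves) - r))


def S1_recursive(n_max):
    """S_n(1) for n = 0..n_max from (C.26), assuming S_0(1) = 0."""
    S = [0]
    for n in range(1, n_max + 1):
        conv = sum(S[a] * catalan(n - a - 1) for a in range(n))
        S.append(2 * catalan(n + 1) - 3 * catalan(n) + 2 * conv)
    return S


S1 = S1_recursive(6)
for n in range(7):
    assert S1[n] == S_brute(n, 1)
```

The enumeration visits $C_n$ trees, so it is only practical for small $n$; it is meant as a sanity check on the algebra rather than as a way of computing $S_n(1)$.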


C.3 Useful expressions used in calculation of $S_n(1)$


In the derivation of the $r = 1$ path length in equation (37) it is necessary to simplify
various expressions involving $1/\sqrt{1-4x}$.


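$$\begin{aligned}
\sum_{\alpha=0}^{\infty}(\alpha+1)\,C_\alpha x^\alpha &= C(x) + x\frac{d}{dx}C(x) \\
&= \frac{1-\sqrt{1-4x}}{2x} + \left[\frac{1}{\sqrt{1-4x}} - \frac{1-\sqrt{1-4x}}{2x}\right] \\
&= \frac{1}{\sqrt{1-4x}}.
\end{aligned} \tag{C.27}$$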
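This result is then used for the other parts of equation (37):

$$\begin{aligned}
\frac{C(x)}{\sqrt{1-4x}} &= \sum_{\alpha=0}^{\infty} C_\alpha x^\alpha \sum_{\beta=0}^{\infty}(\beta+1)\,C_\beta x^\beta \\
&= \sum_{\alpha,\beta=0}^{\infty}(\beta+1)\,C_\alpha C_\beta\, x^{\alpha+\beta} \\
&= \sum_{n=0}^{\infty}\sum_{\beta=0}^{n}(\beta+1)\,C_{n-\beta}C_\beta\, x^n \\
&= \sum_{n=0}^{\infty}(2n+1)\,C_n x^n,
\end{aligned}$$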

where in the last line we have used (A.5). Finally 

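$$\begin{aligned}
\frac{C^2(x)}{\sqrt{1-4x}} &= C(x)\,\frac{C(x)}{\sqrt{1-4x}} \\
&= \sum_{\alpha=0}^{\infty} C_\alpha x^\alpha \sum_{\beta=0}^{\infty}(2\beta+1)\,C_\beta x^\beta \\
&= \sum_{\alpha,\beta=0}^{\infty}(2\beta+1)\,C_\alpha C_\beta\, x^{\alpha+\beta} \\
&= \sum_{n=0}^{\infty}\sum_{\beta=0}^{n}(2\beta+1)\,C_{n-\beta}C_\beta\, x^n \\
&= \sum_{n=0}^{\infty}(n+1)\,C_{n+1} x^n,
\end{aligned}$$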


where in the last line we used (A.4). 
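The three series identities of this subsection reduce to statements about Catalan convolutions that are easy to test numerically. The following sketch is illustrative only (the helper `weighted_convolution` is a hypothetical name). It checks that $(n+1)C_n$ equals the central binomial coefficient, which is the coefficient of $x^n$ in $1/\sqrt{1-4x}$, and that the two weighted convolutions used in the last lines above give $(2n+1)C_n$ and $(n+1)C_{n+1}$.

```python
from math import comb


def catalan(n):
    """n-th Catalan number C_n."""
    return comb(2 * n, n) // (n + 1)


def weighted_convolution(weight, n):
    """sum_{beta=0}^{n} weight(beta) * C_beta * C_{n-beta}."""
    return sum(weight(b) * catalan(b) * catalan(n - b) for b in range(n + 1))


for n in range(20):
    # (C.27): coefficients of 1/sqrt(1-4x) are the central binomial coefficients.
    assert (n + 1) * catalan(n) == comb(2 * n, n)
    # Coefficients of C(x)/sqrt(1-4x).
    assert weighted_convolution(lambda b: b + 1, n) == (2 * n + 1) * catalan(n)
    # Coefficients of C^2(x)/sqrt(1-4x).
    assert weighted_convolution(lambda b: 2 * b + 1, n) == (n + 1) * catalan(n + 1)
```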


C.4 Computing $S_n(2)$

From (54), we use a step function and sum to infinity to obtain

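$$S(x) = \frac{6}{x}\sum_{n=0}^{\infty} C_{n+2}x^{n+2} - 14\sum_{n=0}^{\infty} C_{n+1}x^{n+1} + 2x\sum_{n=0}^{\infty} C_n x^n + 2x\sum_{n=0}^{\infty}\sum_{\alpha=0}^{\infty} H_{n-\alpha}C_{n-\alpha}S_\alpha x^n \tag{C.28}$$

$$\phantom{S(x)} = \frac{6}{x}\left[C(x) - x - 1\right] - 14\left[C(x) - 1\right] + 2xC(x) + 2x\sum_{\alpha=0}^{\infty}\sum_{q=-\alpha}^{\infty} H_q C_q x^q S_\alpha x^\alpha \tag{C.29}$$

$$\phantom{S(x)} = \frac{6}{x}\left[C(x) - 1\right] + 8 - 14C(x) + 2xC(x) + 2x\sum_{\alpha=0}^{\infty}\sum_{q=0}^{\infty} C_q x^q S_\alpha x^\alpha \tag{C.30}$$

$$\phantom{S(x)} = 6C^2(x) + 8 - 14C(x) + 2xC(x) + 2xC(x)S(x). \tag{C.31}$$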

Then we can write 


$$\sqrt{1-4x}\,S(x) = 6C^2(x) + 8 - 14C(x) + 2xC(x). \tag{C.32}$$

Then, using the relations of section A, the final result can be obtained:

$$\begin{aligned}
S(x) &= \sum_{n=0}^{\infty}\left[6(n+1)\,C_{n+1} - 14(2n+1)\,C_n + 8(n+1)\,C_n + 2x(2n+1)\,C_n\right]x^n \\
&= \sum_{n=0}^{\infty}\left[6(n+1)\,C_{n+1} - (20n+6)\,C_n\right]x^n + 2\sum_{n=1}^{\infty}(2n-1)\,C_{n-1}x^n \\
&= \sum_{n=1}^{\infty}\left[6(n+1)\,C_{n+1} - (20n+6)\,C_n + 2(2n-1)\,C_{n-1}\right]x^n + 6C_1 - 6C_0 \\
&= \sum_{n=1}^{\infty}\left[6(n+1)\,C_{n+1} - (20n+6)\,C_n + 2(2n-1)\,C_{n-1}\right]x^n.
\end{aligned} \tag{C.33}$$

In the second line the index of the last sum has been shifted, $n \to n-1$, so that $2x(2n+1)C_n x^n$ becomes $2(2n-1)C_{n-1}x^n$. This now gives an expression for $S_n(2)$, which can be tidied up using Eq. (2), i.e.

$$\begin{aligned}
S_n(2) &= 6(n+1)\,C_{n+1} - (20n+6)\,C_n + 2(2n-1)\,C_{n-1} \\
&= \frac{12(n+1)(2n+1)}{n+2}\,C_n - 2(10n+3)\,C_n + (n+1)\,C_n \\
&= \frac{(n-1)(5n-2)}{n+2}\,C_n.
\end{aligned} \tag{C.34}$$
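The closed form can be checked against the generating-function relation by matching powers of $x$. The sketch below is an illustration, not the authors' procedure. Equating coefficients of $x^{n+1}$ on both sides of $S(x) = 6C^2(x) + 8 - 14C(x) + 2xC(x) + 2xC(x)S(x)$, and using $C^2(x) = (C(x) - 1)/x$, gives $S_{n+1} = 6C_{n+2} - 14C_{n+1} + 2C_n + 2\sum_{\alpha=0}^{n} C_{n-\alpha}S_\alpha$ with $S_0 = 0$ from the constant term. The function names are hypothetical.

```python
from fractions import Fraction
from math import comb


def catalan(n):
    """n-th Catalan number C_n."""
    return comb(2 * n, n) // (n + 1)


def S2_from_generating_function(n_max):
    """Coefficients S_0..S_{n_max} of S(x), obtained by matching powers of x."""
    S = [0]  # constant term: 6 + 8 - 14 = 0
    for n in range(n_max):
        conv = sum(catalan(n - a) * S[a] for a in range(n + 1))
        S.append(6 * catalan(n + 2) - 14 * catalan(n + 1)
                 + 2 * catalan(n) + 2 * conv)
    return S


def S2_three_term(n):
    """Three-term expression for S_n(2), valid for n >= 1."""
    return (6 * (n + 1) * catalan(n + 1) - (20 * n + 6) * catalan(n)
            + 2 * (2 * n - 1) * catalan(n - 1))


def S2_closed(n):
    """Closed form (n-1)(5n-2) C_n / (n+2), valid for n >= 1."""
    return Fraction((n - 1) * (5 * n - 2) * catalan(n), n + 2)


S = S2_from_generating_function(12)
for n in range(1, 13):
    assert S[n] == S2_three_term(n) == S2_closed(n)
```

`Fraction` is used for the closed form so that the comparison does not silently rely on the numerator being divisible by $n + 2$. The closed form applies only for $n \geq 1$, consistent with the sum in (C.33) starting at $n = 1$.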